Efficient Penetration Testing Path Planning Based on Reinforcement Learning with Episodic Memory


Abstract: Intelligent penetration testing is of great significance for improving the security of information systems, and its critical issue is the planning of penetration test paths. Because attackers can rarely obtain complete network information in realistic scenarios, Reinforcement Learning (RL) is a promising way to discover the optimal penetration path under incomplete information about the target network. Existing RL-based methods are challenged by the sizeable discrete action space, which makes convergence difficult. Moreover, most methods still rely on expert knowledge. To address these issues, this paper proposes a penetration path planning method based on reinforcement learning with episodic memory. First, the penetration testing problem is formally described in terms of reinforcement learning. To speed up training without specific prior knowledge, the proposed algorithm introduces, for the first time, episodic memory to store advantageous strategies the agent has experienced. Furthermore, the method offers an exploration strategy based on episodic memory to guide the agent's learning. The design makes full use of historical experience to reduce blind exploration and improve planning efficiency. Finally, comparison experiments are carried out against existing RL-based methods. The results show that the proposed method converges better and reduces running time by more than 20%.
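The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical toy network (a chain of five hosts where exactly one of four exploits works per host), tabular Q-learning, and an episodic memory that keeps the best discounted return seen for each (state, action) pair. The names `CORRECT`, `step`, `choose_action`, and `train`, along with every hyperparameter, are illustrative assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy scenario (an assumption, not from the paper):
# a chain of hosts; at host i only the exploit CORRECT[i] succeeds.
N_HOSTS = 5
N_ACTIONS = 4
CORRECT = [2, 0, 3, 1, 2]


def step(state, action):
    """Apply an exploit; return (next_state, reward, done)."""
    if action == CORRECT[state]:
        nxt = state + 1
        if nxt == N_HOSTS:          # final target compromised
            return nxt, 10.0, True
        return nxt, -1.0, False     # moved one host deeper
    return state, -1.0, False       # failed exploit still costs a step


def choose_action(state, q, memory, eps, rng):
    """Epsilon-greedy on Q, with memory-guided exploratory steps."""
    if rng.random() < eps:
        remembered = [a for a in range(N_ACTIONS) if (state, a) in memory]
        # Half of the exploratory steps replay the best remembered action
        # instead of exploring blindly (assumed mixing ratio).
        if remembered and rng.random() < 0.5:
            return max(remembered, key=lambda a: memory[(state, a)])
        return rng.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q[(state, a)])


def train(episodes=300, alpha=0.5, gamma=0.9, eps=0.2, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)   # tabular Q-values, default 0.0
    memory = {}              # (state, action) -> best discounted return seen
    for _ in range(episodes):
        state, trajectory = 0, []
        for _ in range(max_steps):
            action = choose_action(state, q, memory, eps, rng)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q[(nxt, a)] for a in range(N_ACTIONS))
            # Standard one-step Q-learning update.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            trajectory.append((state, action, reward))
            state = nxt
            if done:
                break
        # Walk the episode backwards and keep the best return per pair.
        g = 0.0
        for s, a, r in reversed(trajectory):
            g = r + gamma * g
            if g > memory.get((s, a), float("-inf")):
                memory[(s, a)] = g
    return q, memory


q, memory = train()
policy = [max(range(N_ACTIONS), key=lambda a: q[(s, a)])
          for s in range(N_HOSTS)]
print("greedy exploit per host:", policy)
```

In this sketch the memory only biases exploratory steps; the greedy choice still comes from the learned Q-values, which keeps the two mechanisms easy to compare in isolation.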

Authors: Ziqiao Zhou, Tianyang Zhou, Jinghao Xu, Junhu Zhu

Affiliations: Henan Key Laboratory of Information Security; School of Cryptographic Engineering

Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 9, pp. 2613-2634 (22 pages). Chinese journal title: 工程与科学中的计算机建模（英文）

Keywords: intelligent penetration testing; penetration testing path planning; reinforcement learning; episodic memory; exploration strategy

Classification code: P61 [Astronomy and Earth Sciences: Mineral Deposit Geology]


