Optimal Penetration Path Generation Based on Maximum Entropy Reinforcement Learning (original title: 基于最大熵强化学习的最优渗透路径生成方法)
Abstract: Analyzing intrusion intent and penetration behavior from the attacker's perspective is important for guiding network security defense. However, most existing penetration paths are constructed from an instantaneous snapshot of the network environment, which reduces their reference value. To address this problem, the paper proposes an optimal penetration path generation method based on maximum entropy reinforcement learning, which captures multiple modes of near-optimal behavior through exploration while the network environment changes dynamically. First, the penetration process is modeled using an attack graph and vulnerability scores, and the threat level of penetration behavior is characterized by quantifying the attack gain. Then, considering the complexity of intrusion behavior, a Soft Q-learning method based on the maximum entropy model is developed; the relative importance of the entropy term and the reward is controlled so that the process of solving for the penetration path remains stable. Finally, the method is applied to a dynamically changing test environment to generate highly available penetration paths. Simulation results show that, compared with existing reinforcement learning baselines, the proposed method adapts better to environmental change and generates higher-gain penetration paths at lower cost.
Authors: WANG Yan (王焱), WANG Tianjing (王天荆), SHEN Hang (沈航), BAI Guangwei (白光伟) (College of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China)
Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2024, No. 3, pp. 360-367 (8 pages)
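Funding: National Natural Science Foundation of China (61502230, 61501224); Natural Science Foundation of Jiangsu Province (BK20201357); Jiangsu Province "Six Talent Peaks" High-Level Talent Project (RJFW-020).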
Keywords: Maximum entropy reinforcement learning; Attack graph; Soft Q-learning; Penetration path
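
The Soft Q-learning idea named in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the three-node attack graph, the reward values (standing in for quantified attack gain), the function names, and all hyperparameters below are assumptions made purely for illustration. The sketch uses the standard maximum entropy formulation, in which the state value is a temperature-scaled log-sum-exp of the Q-values and the policy is the corresponding softmax; the temperature `alpha` sets how much entropy is weighted relative to reward.

```python
import math
import random

# Hypothetical attack graph: state -> {action: (next_state, reward)}.
# Rewards are made-up stand-ins for "attack gain"; "db" is the terminal target.
GRAPH = {
    "web": {"to_db": ("db", 2.0), "to_mail": ("mail", 1.0)},
    "mail": {"to_db": ("db", 4.0)},
    "db": {},
}

def soft_value(q_row, alpha):
    """Soft state value alpha * log(sum_a exp(Q(s,a)/alpha)); 0 for terminal states."""
    if not q_row:
        return 0.0
    m = max(q_row.values())  # shift by the max for numerical stability
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_row.values()))

def soft_policy(q_row, alpha):
    """Maximum entropy policy: pi(a|s) proportional to exp(Q(s,a)/alpha)."""
    m = max(q_row.values())
    weights = {a: math.exp((q - m) / alpha) for a, q in q_row.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

def train(graph, start, alpha=0.5, gamma=0.9, lr=0.1, episodes=2000, seed=0):
    """Tabular soft Q-learning; actions are sampled from the soft policy."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in graph.items()}
    for _ in range(episodes):
        s = start
        while graph[s]:  # run until a terminal state (no outgoing actions)
            probs = soft_policy(q[s], alpha)
            a = rng.choices(list(probs), weights=list(probs.values()))[0]
            s_next, r = graph[s][a]
            target = r + gamma * soft_value(q[s_next], alpha)
            q[s][a] += lr * (target - q[s][a])
            s = s_next
    return q

def greedy_path(graph, q, start):
    """Read out one path by following the highest-Q action from each state."""
    path, s = [start], start
    while graph[s]:
        a = max(q[s], key=q[s].get)
        s = graph[s][a][0]
        path.append(s)
    return path

if __name__ == "__main__":
    q = train(GRAPH, "web")
    print(greedy_path(GRAPH, q, "web"))
    print(soft_policy(q["web"], alpha=0.5))
```

In this toy graph the two-hop route through `mail` has the larger discounted gain, so the learned Q-values favor it, while the softmax policy still assigns a small probability to the direct route. A larger `alpha` flattens the policy toward uniform exploration, and a smaller one approaches ordinary greedy Q-learning.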