
Intelligent Cyber Security Defense based on Deep Reinforcement Learning (基于深度强化学习的智能网络安全防护研究)

Cited by: 5
Abstract: The rapid development of artificial intelligence (AI) provides new approaches and technical means for confrontation in cyberspace security. However, applying AI to cyber security will also increase the speed, intensity, and complexity of cyber attack-defense confrontation. By studying intelligent cyber security defense based on deep reinforcement learning, this paper explores methods and a process for making cyberspace security defense intelligent. Deep learning extracts features from cyber security situation data and is used to build an agent; the reward function uses the threat level of cyber attacks as reward and penalty to guide learning; and reinforcement learning judges whether strategies and actions are good or bad. Training in a virtual integrated cyber range yields a security defense agent and an optimal security defense policy.
Authors: ZHOU Yun (周云); LIU Yuehua (刘月华) (Unit 78111 of PLA, Chengdu Sichuan 21000, China; No.30 Institute of CETC, Chengdu Sichuan 610041, China)
Source: Communications Technology (《通信技术》), 2021, No. 11, pp. 2545-2550 (6 pages)
Keywords: deep reinforcement learning; cyber security defense; cyber range; virtualization; cyberspace
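
The abstract's central idea, a reward function that uses the threat level of cyber attacks as reward and penalty, can be illustrated with a small sketch. It is not the authors' implementation: tabular Q-learning stands in for deep reinforcement learning, and the four threat levels, the two actions, the transition rule, and `DEFENSE_COST` are all hypothetical values chosen only to keep the example short and runnable.

```python
import random

# Hypothetical toy setup (not from the paper): the state is a coarse threat
# level from 0 (safe) to 3 (critical); action 0 idles, action 1 defends.
N_STATES = 4
ACTIONS = (0, 1)
DEFENSE_COST = 0.2  # assumed cost of applying a defensive measure


def reward(threat_level, action):
    """Penalize by the resulting threat level, plus a small cost for defending."""
    return -float(threat_level) - (DEFENSE_COST if action == 1 else 0.0)


def step(state, action, rng):
    """Assumed dynamics: defending lowers the threat, idling tends to raise it."""
    if action == 1:
        next_state = max(state - 1, 0)
    elif rng.random() < 0.7:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = state
    return next_state, reward(next_state, action)


def train(episodes=500, steps=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
            next_state, r = step(state, action, rng)
            # Move the estimate toward the one-step bootstrapped target.
            target = r + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action for every threat level."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Because the reward is the negated threat level, the agent is pushed toward actions that keep the threat low; in the setting the abstract describes, the table would be replaced by a neural network over extracted situation features and the toy dynamics by the cyber range.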