
Research on Command Decision-making of Submarine Attack and Defense Confrontation Training Based on Deep Reinforcement Learning

Cited by: 1
Abstract: Offensive and defensive confrontation between submarines and surface ship formations is a key topic in submarine combat research. Ensuring that a submarine survives and breaks out of a joint blockade by a ship formation, anti-submarine helicopters, and other forces is a test of submarine command decision-making. To address this asymmetric submarine-ship-helicopter game confrontation scenario, the submarine agent is constructed from two directions, deep reinforcement learning and rule-based inference, and two mechanisms for improving the Proximal Policy Optimization (PPO) algorithm are proposed. Mutual game confrontation and distributed training are then carried out, ultimately enabling the submarine to make intelligent decisions during the confrontation. The technical route and algorithms were implemented and verified on a wargaming platform. The improved algorithm shows considerably better convergence speed and stability, and the work can serve as a technical reference for research on intelligent submarine command decision-making.
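For orientation, the abstract names PPO as the base algorithm but does not describe the two proposed improvement mechanisms. The following is only a minimal sketch of the clipped surrogate objective of standard PPO, not the authors' improved variant. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are illustrative assumptions rather than details taken from the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same steps.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(nl - ol)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Clipping removes the incentive to move the new policy far from the one that collected the data within a single update, which is one reason PPO is a common baseline when training stability matters.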
Authors: GUO Hong-yu (郭洪宇); CHU Yang (初阳); LIU Zhi (刘志); ZHOU Yu-fang (周玉芳) (Jiangsu Automation Research Institute, Lianyungang 222061, China)
Source: Command Control & Simulation (《指挥控制与仿真》), 2022, No. 1, pp. 103-111 (9 pages)
Keywords: intelligent command decision making; deep reinforcement learning; Proximal Policy Optimization; mutual game confrontation
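Classification code: E917 [Military]
Related literature: references 7; second-level references 43; co-citing documents 192; co-cited documents 3; citing documents 1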