Dynamic Penetration Decision of Loitering Munition Group Based on Knowledge-assisted Reinforcement Learning
Abstract: Penetration control decision-making for a loitering munition group (LMG) is key to improving the autonomy and intelligence of group operations. Generating penetration maneuver commands online is difficult in a dynamic environment that contains interceptors and pop-up air-defense fire zones. To address this problem, the paper proposes an LMG penetration control decision (LMGPCD) algorithm based on knowledge-assisted reinforcement learning. Domain knowledge and rule knowledge are used to improve the design of the state space and the reward function, which enhances the algorithm's generalization ability and training convergence speed. A decision framework built on the soft actor-critic (SAC) algorithm is constructed to increase exploration efficiency. Expert experience and imitation learning are used to mitigate two problems that arise with multiple munitions and multiple threats: a narrow solution space and a shortage of effective training experience early in training. Experimental results show that the proposed algorithm generates effective penetration maneuver commands in real time in a dynamic environment and outperforms the comparison methods, which verifies its effectiveness.
Authors: SUN Hao (孙浩); LI Haiqing (黎海青); LIANG Yan (梁彦); MA Chaoxiong (马超雄); WU Han (吴翰). Affiliations: School of Automation, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China; Xi'an Modern Control Technology Research Institute, Xi'an 710065, Shaanxi, China.
Source: Acta Armamentarii (《兵工学报》), 2024, No. 9, pp. 3161-3176 (16 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: National Natural Science Foundation of China (61873205).
Keywords: loitering munition group; knowledge-assisted deep reinforcement learning; soft actor-critic algorithm; dynamic environment penetration; control decision