Attack-Defense Game based on Deep Reinforcement Learning for High Speed Vehicle
Cited by: 2
Abstract: To address the attack-defense game between a high-speed aircraft and an interceptor, this paper studies an improved algorithm based on the double deep Q-network (DDQN). To remedy the low sample efficiency of classical DDQN, the algorithm maintains multiple experience replay buffers and combines the accumulated temporal-difference error (TD-error) of the Q-value over one engagement round with the accumulated reward, using fuzzy reasoning to compute the probability that a sample is stored in each buffer. An integral sampler, designed around the temporal-difference error of the accumulated reward, then draws training samples from the different buffers. The reward function is designed to reduce the aircraft's own mechanical energy consumption, conditional on successful penetration. Experimental results show that, compared with classical DDQN, the improved algorithm effectively raises sample efficiency and offers a new approach to the maneuvering-penetration problem for high-speed aircraft.
Authors: He Xiangyuan; Chen Jun; Guo Hao; Yu Zhuoyang; Tian Bo (Science and Technology on Space Physics Laboratory, Beijing 100076, China)
Source: Aerospace Control (《航天控制》), CSCD, Peking University Core Journal, 2022, No. 4, pp. 76-83 (8 pages)
Keywords: High speed aircraft; Interceptor; Improved DDQN; Fuzzy reasoning; Attack-defense game
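
The abstract builds on classical DDQN and on TD-error as the signal for sorting samples. As a minimal sketch of that baseline only (not the authors' improved algorithm, whose fuzzy rules and integral sampler are not specified in this record), the classical double-DQN target and its TD-error can be written as follows. The function names, the default discount factor `gamma`, and the list-based Q-value inputs are illustrative assumptions.

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Classical double-DQN target for one transition.

    next_q_online / next_q_target are hypothetical lists of Q-values for the
    next state, produced by the online and target networks respectively.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the greedy action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates that action.
    return reward + gamma * next_q_target[best_action]


def td_error(q_sa, target):
    """TD-error: gap between the bootstrapped target and the current estimate."""
    return target - q_sa
```

Decoupling action selection from action evaluation is what distinguishes DDQN from plain DQN. How per-step TD-errors are accumulated over an engagement round and mapped to buffer probabilities is the paper's contribution and is not reproduced here.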