Multi-agent Confrontation Strategy Algorithm Based on Improved MADDPG

Cited by: 2
Abstract This paper explores the application of deep reinforcement learning to confrontation combat strategies. The Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm suffers from partial observability, difficulty converging during training, and poor stability. To address these three problems respectively, a Long Short-Term Memory (LSTM) network, loss-based prioritized experience replay, and policy gradient weighting are introduced into MADDPG. Combining confrontation combat decision-making scenarios with the improved algorithm, three decision-making experiment scenarios are designed. The proposed algorithm is compared with MADDPG and DDPG in a multi-agent simulated confrontation environment, and the results show that it improves both the stability and the efficiency of confrontation decision-making.
Authors LIU Peng (刘鹏); ZHAO Jianxin (赵建新); ZHANG Hongying (张宏映); GAO Tengfei (高腾飞); YAN Tao (闫涛). North Automatic Control Technology Institute, Taiyuan 030006, China
Source Fire Control & Command Control (《火力与指挥控制》), CSCD, Peking University Core Journal, 2023, No. 3, pp. 132-138, 145 (8 pages)
Keywords deep reinforcement learning; confrontation decision-making; long short-term memory network; prioritized experience sampling; policy gradient