Prioritized experience selection mechanism for the MADDPG algorithm (MADDPG算法经验优先抽取机制), cited by 11

Multi-agent deep deterministic policy gradient algorithm via prioritized experience selected method
Abstract: To address the low training efficiency and slow convergence of the multi-agent deep deterministic policy gradient (MADDPG) algorithm, this paper studies a prioritized experience selection mechanism for MADDPG and proposes the PES-MADDPG algorithm. First, the model and training method of MADDPG are analyzed. Next, the multi-agent experience buffer is improved: a priority evaluation function is designed based on the error of the critic (policy evaluation) function and on how often each experience has been drawn for training, and the priority is used as the selection probability when sampling experiences to train the neural networks. Finally, six groups of comparative experiments are conducted in two types of environment, cooperative navigation and competitive confrontation. The results show that the prioritized experience selection mechanism speeds up MADDPG training and that the trained agents perform better. The mechanism is also applicable, to some extent, to training multiple agents controlled by the deep deterministic policy gradient (DDPG) algorithm.
Authors: HE Ming (何明); ZHANG Bin (张斌); LIU Qiang (柳强); CHEN Xi-liang (陈希亮); YANG Cheng (杨铖). Affiliations: College of Command and Control Engineering, The Army Engineering University of PLA, Nanjing 210007, China; Naval Command College, Nanjing 210000, China.
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2021, No. 1, pp. 68-74 (7 pages).
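Funding: National Key R&D Program of China (2018YFC0806900, 2016YFC0800606, 2016YFC0800310); Natural Science Foundation of Jiangsu Province (BK20161469); Key R&D Program of Jiangsu Province (BE2016904, BE2017616, BE2018754); China Postdoctoral Science Foundation (2018M633757).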
Keywords: multi-agent; deep reinforcement learning; MADDPG; prioritized experience selection
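
The mechanism summarized in the abstract (a priority built from critic error and sampling frequency, then used as the selection probability) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual priority function, so everything specific here is an assumption made for illustration: the class name `PrioritizedBuffer`, the parameters `alpha`, `beta` and `eps`, and the formula `(|error| + eps)^alpha / (1 + beta * count)`. It should not be read as the PES-MADDPG implementation.

```python
import random


class PrioritizedBuffer:
    """Illustrative replay buffer. The priority formula is an ASSUMPTION,
    not the function defined in the paper."""

    def __init__(self, capacity, alpha=0.6, beta=0.1, eps=1e-3, seed=None):
        self.capacity = capacity
        self.alpha = alpha    # how strongly critic error shapes priority
        self.beta = beta      # how strongly repeated sampling lowers priority
        self.eps = eps        # keeps every priority strictly positive
        self.items = []       # stored (joint) transitions
        self.errors = []      # last known absolute critic error per item
        self.counts = []      # number of times each item has been sampled
        self.next = 0         # slot to overwrite once the buffer is full
        self.rng = random.Random(seed)

    def add(self, transition, error=None):
        # A new experience with unknown error gets the current maximum error,
        # so that it is likely to be drawn soon.
        if error is None:
            error = max(self.errors, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.errors.append(abs(error))
            self.counts.append(0)
        else:
            self.items[self.next] = transition
            self.errors[self.next] = abs(error)
            self.counts[self.next] = 0
        self.next = (self.next + 1) % self.capacity

    def priority(self, i):
        # Larger critic error raises priority; frequent sampling lowers it.
        error_term = (self.errors[i] + self.eps) ** self.alpha
        return error_term / (1.0 + self.beta * self.counts[i])

    def sample(self, batch_size):
        # Priorities act as (unnormalized) selection probabilities.
        indices = list(range(len(self.items)))
        weights = [self.priority(i) for i in indices]
        chosen = self.rng.choices(indices, weights=weights, k=batch_size)
        for i in chosen:
            self.counts[i] += 1
        return chosen, [self.items[i] for i in chosen]

    def update_errors(self, indices, errors):
        # Call after a training step with the freshly computed critic errors.
        for i, e in zip(indices, errors):
            self.errors[i] = abs(e)
```

In a MADDPG-style setting each stored transition would typically be a joint tuple (all agents' observations, actions, rewards and next observations), and `update_errors` would be called with the critic's temporal-difference errors for the sampled batch. Dividing by the sampling count is only one way to keep a few high-error experiences from dominating training; the paper may combine the two factors differently.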