Multi-robot Reinforcement Learning Navigation Incorporating Two Levels of Attention
Abstract To address the low learning efficiency and slow convergence caused by the complex relationships among agents in multi-agent reinforcement learning, this study proposes MADDPG-Attention, a method based on a two-level attention mechanism. It adds hard and soft attention to the Critic network of the MADDPG algorithm so that each agent can identify which experience of the other agents is worth learning from, which improves the efficiency of mutual learning. A single-level soft attention mechanism assigns a learning weight even to completely irrelevant agents. Hard attention is therefore applied first to decide whether learning between two agents is necessary at all, and agents carrying irrelevant information are pruned. Soft attention is then applied to judge how important learning between two agents is, and learning weights are assigned according to this importance distribution, so that an agent learns from those agents that have usable experience. Tests on the cooperative navigation task of the multi-agent particle environment show that MADDPG-Attention captures the complex relationships more clearly and achieves a navigation success rate above 90% in all three environments, improving learning efficiency and accelerating convergence.
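The hard-then-soft ordering described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the hard gate or the soft scores are computed, so this sketch assumes a scaled dot-product score, a simple threshold as the hard gate, and a softmax over the surviving agents. The names `two_level_attention`, `query`, `keys`, `values`, and `hard_threshold` are hypothetical.

```python
import numpy as np

def two_level_attention(query, keys, values, hard_threshold=0.0):
    """Weight other agents' features: hard gate first, then soft attention.

    query:  (d,)   embedding of the learning agent (hypothetical)
    keys:   (n, d) embeddings of the n other agents (hypothetical)
    values: (n, m) features the Critic would take from each other agent
    """
    d = query.shape[0]
    scores = keys @ query / np.sqrt(d)  # scaled dot-product relevance

    # Hard attention (ASSUMED to be a plain threshold on the score):
    # a 0/1 gate deciding whether learning from agent j is needed at all.
    gate = scores > hard_threshold

    weights = np.zeros_like(scores, dtype=float)
    if gate.any():
        # Soft attention: softmax over the surviving agents only, so a
        # pruned agent gets exactly zero weight, not a small positive one.
        kept = scores[gate]
        exp = np.exp(kept - kept.max())
        weights[gate] = exp / exp.sum()

    context = weights @ values  # (m,) aggregated feature for the Critic
    return weights, context

# Usage: the second agent points away from the query and is pruned.
q = np.array([1.0, 0.0])
k = np.array([[2.0, 0.0], [-1.0, 0.0], [1.0, 0.0]])
w, ctx = two_level_attention(q, k, np.eye(3))
```

With soft attention alone, the second agent would still receive a small positive weight. The gate removes it before normalization, which is the motivation the abstract gives for adding the hard level.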
Authors ZHANG Yao-Dan; KUANG Li-Qun; JIAO Shi-Chao; HAN Hui-Yan; XUE Hong-Xin (School of Computer Science and Technology, North University of China, Taiyuan 030051, China; Shanxi Key Laboratory of Machine Vision and Virtual Reality, North University of China, Taiyuan 030051, China; Shanxi Province's Vision Information Processing and Intelligent Robot Engineering Research Center, Taiyuan 030051, China)
Source Computer Systems & Applications, 2023, Issue 12, pp. 43-51 (9 pages)
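Funding National Natural Science Foundation of China (62272426, 62106238); Shanxi Province Science and Technology Major Special Project (202201150401021); Shanxi Province Science and Technology Achievement Transformation Guidance Special Project (202104021301055); Shanxi Province Research Funding Program for Returned Overseas Scholars (2020-113); Shanxi Province Basic Research Program (202203021222027).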
Keywords multi-agent reinforcement learning; navigation; MADDPG; hard attention; soft attention