
Research Progress in Multi-Agent Reinforcement Learning from the Perspective of Competition and Cooperation

RECENT PROCESS AND PROSPECT OF MULTI-AGENT REINFORCEMENT LEARNING UNDER THE PERSPECTIVE OF COMPETITION AND COOPERATION
Abstract: With the rapid progress of deep learning and reinforcement learning, multi-agent reinforcement learning (MARL) has become a general approach to large-scale, complex sequential decision-making problems. To promote the development of the field, this paper collects and reviews recent research results from the perspective of competition and cooperation. It first introduces single-agent reinforcement learning. It then presents the two basic theoretical frameworks of MARL, Markov games and extensive-form games, and focuses on the classic algorithms and recent advances in three settings: competitive, cooperative, and mixed. Finally, it discusses the core challenge of MARL, the non-stationarity of the environment, and uses an example to summarize existing solution approaches and outline future directions.
Authors: Tian Xiaohe; Li Wei; Xu Zheng; Liu Tianxing; Qi Xiaoya; Gan Zhongxue (Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Shanghai Engineering Research Center of AI & Robotics, Shanghai 200433, China; Engineering Research Center of AI & Robotics, Ministry of Education, Shanghai 200433, China; Ji Hua Laboratory, Foshan 528000, Guangdong, China; Beijing Deep Singularity Technology Co., Ltd., Beijing 100089, China)
Source: Computer Applications and Software (《计算机应用与软件》), PKU Core Journal, 2024, No. 4, pp. 1-15 (15 pages)
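Funding: Ji Hua Laboratory Fund Project, Guangdong Province (X190021TB190); Science and Technology Commission of Shanghai Municipality project (1951113200).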
Keywords: deep learning; reinforcement learning; multi-agent reinforcement learning; non-stationarity of the environment
