
基于博弈论及Q学习的多Agent协作追捕算法 (Cited: 5)

Multi-agent collaborative pursuit algorithm based on game theory and Q-learning
Abstract The multi-agent collaborative pursuit problem is a typical problem in multi-agent coordination and collaboration research. Aiming at the pursuit of a single escaper with learning ability, a multi-agent collaborative pursuit algorithm based on game theory and Q-learning is proposed. Firstly, a cooperative pursuit team is established and a game model of cooperative pursuit is built. Secondly, by learning the escaper's strategy choices, the escaper's finite Step-T cumulative-reward trajectory is estimated and incorporated into the pursuers' strategy sets. Finally, the cooperative pursuit game is solved for a Nash equilibrium, and each agent executes its equilibrium strategy to complete the pursuit task. In addition, since the game may have multiple equilibrium solutions, a fictitious-play (virtual action) behavior selection algorithm is added to choose the optimal equilibrium strategy. C# simulation experiments show that the proposed algorithm can effectively solve the pursuit of a single learning escaper in an obstacle environment, and comparative analysis of the experimental data shows that, under the same conditions, its pursuit efficiency is better than that of purely game-theoretic or purely learning-based pursuit algorithms.
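The abstract describes a three-step loop: model the escaper's choices with Q-learning, fold the escaper's predicted Step-T trajectory into the pursuers' strategy sets, then solve the resulting cooperative game, falling back on fictitious play when several equilibria exist. The sketch below (Python, not the paper's C# implementation) illustrates the two reusable pieces under simplifying assumptions: a grid world, a tabular Q-learner standing in for the escaper model, and fictitious play over a two-pursuer common-payoff matrix game. All names here (EscaperModel, fictitious_play, ACTIONS) are illustrative and not taken from the paper.

import numpy as np

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left

class EscaperModel:
    """Tabular Q-learning model of the escaper's move preferences (illustrative)."""

    def __init__(self, grid_size, alpha=0.1, gamma=0.9):
        self.q = np.zeros((grid_size, grid_size, len(ACTIONS)))
        self.alpha, self.gamma = alpha, gamma

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update on an observed escaper transition.
        x, y = state
        nx, ny = next_state
        target = reward + self.gamma * self.q[nx, ny].max()
        self.q[x, y, action] += self.alpha * (target - self.q[x, y, action])

    def predicted_path(self, state, steps):
        # Greedy Step-T trajectory implied by the learned Q-values; the pursuers
        # fold this predicted path into their own strategy sets.
        size_x, size_y, _ = self.q.shape
        path = [state]
        for _ in range(steps):
            x, y = state
            dx, dy = ACTIONS[int(self.q[x, y].argmax())]
            state = (min(max(x + dx, 0), size_x - 1), min(max(y + dy, 0), size_y - 1))
            path.append(state)
        return path


def fictitious_play(payoff, iters=500):
    """Pick a pure joint action in a two-pursuer common-payoff matrix game by
    repeatedly best-responding to the other player's empirical action frequencies."""
    m, n = payoff.shape
    counts_a, counts_b = np.ones(m), np.ones(n)
    for _ in range(iters):
        a = int((payoff @ (counts_b / counts_b.sum())).argmax())  # pursuer 1 best response
        b = int(((counts_a / counts_a.sum()) @ payoff).argmax())  # pursuer 2 best response
        counts_a[a] += 1
        counts_b[b] += 1
    return int(counts_a.argmax()), int(counts_b.argmax())


if __name__ == "__main__":
    model = EscaperModel(grid_size=10)
    model.update(state=(5, 5), action=2, reward=-1.0, next_state=(6, 5))
    print(model.predicted_path((5, 5), steps=4))

    # Toy joint payoff, e.g. negative distance of each joint pursuer move to the
    # escaper's predicted end point (random placeholder values here).
    payoff = np.random.rand(len(ACTIONS), len(ACTIONS))
    print(fictitious_play(payoff))

In this sketch, fictitious play serves only as the tie-breaking coordination device among candidate equilibria, which is the role the abstract assigns to the virtual-action selection step.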
Authors ZHENG Yanbin (郑延斌); FAN Wenxin (樊文鑫); HAN Mengyun (韩梦云); TAO Xueli (陶雪丽) (College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China; Henan Engineering Laboratory of Smart Commerce and Internet of Things Technologies, Xinxiang, Henan 453007, China)
Source Journal of Computer Applications (《计算机应用》, CSCD, Peking University Core), 2020, No. 6: 1613-1620 (8 pages)
Funding National Natural Science Foundation of China (U1604156); Henan Normal University Youth Science Foundation (2017QK20).
Keywords multi-agent; collaborative pursuit; game theory; Q-learning; reinforcement learning

References: 5

Secondary references: 42

  • 1 ZHOU Pucheng, HONG Bingrong, WANG Yuehai. Multi-robot cooperative pursuit in dynamic environments [J]. Robot, 2005, 27(4): 289-295. (Cited: 16)
  • 2 LI Shuqin, WANG Huan, LI Wei, YANG Jingyu. Algorithms for the multiple mobile-target hunting problem based on dynamic roles [J]. Journal of System Simulation, 2006, 18(2): 362-365. (Cited: 12)
  • 3 ZHOU Pucheng, HONG Bingrong, HUANG Qingcheng. A novel multi-agent reinforcement learning approach [J]. Acta Electronica Sinica, 2006, 34(8): 1488-1491. (Cited: 8)
  • 4 Yamaguchi H. A cooperative hunting behavior by mobile-robot troops [J]. International Journal of Robotics Research, 1999, 18(8): 931-940.
  • 5 Kopparty S, Ravishankar C V. A framework for pursuit evasion games in Rn [J]. Information Processing Letters, 2005, 96(3): 114-122.
  • 6 Kok J R, Vlassis N. Sparse cooperative Q-learning [C]// Proceedings of the 21st International Conference on Machine Learning. Banff, Alberta, Canada: MIT Press, 2004: 61-68.
  • 7 Ishiwaka Y, Sato T, Kakazu Y. An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning [J]. Robotics and Autonomous Systems, 2003, 43(4): 245-256.
  • 8 Vidal R, Shakernia O, Kim H J, Shim D H, Sastry S. Probabilistic pursuit-evasion games: theory, implementation and experimental evaluation [J]. IEEE Transactions on Robotics and Automation, 2002, 18(5): 662-669.
  • 9 Grinton C. A Testbed for Investigating Agent Effectiveness in a Multiagent Pursuit Game [D]. Victoria, Australia: The University of Melbourne, 1996.
  • 10 Schenato L, Oh S, Sastry S, Bose P. Swarm coordination for pursuit evasion games using sensor networks [C]// Proceedings of the 2005 IEEE International Conference on Robotics and Automation. Barcelona, Spain: IEEE Press, 2005: 2493-2498.

Co-cited literature: 50

Jointly cited literature: 45

Citing literature: 5

Secondary citing literature: 8
