

Multi-Agent Evasion Algorithm Design Based on Deep Q-Network
Abstract: Research on multi-agent pursuit-evasion games is typically conducted in the two-dimensional plane with an unconstrained evader, and traditional methods have difficulty designing control strategies when no accurate model is available. For the case of an evader whose motion is constrained in three-dimensional space, this paper proposes a multi-agent evasion algorithm based on the deep Q-network (DQN). The algorithm uses decentralized learning: the evader obtains a satisfactory evasion strategy by exploring the environment. To improve learning efficiency, strategy learning is divided into two stages according to task difficulty, and corresponding reward functions are designed to guide the agent toward the desired evasion strategy. Simulation results show that the learned evasion strategy performs stably and generalizes: after certain changes to the initial positions, the evader can still escape successfully.
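
The record gives only the abstract, but the two named ingredients, DQN learning and two-stage reward shaping, follow standard mechanics. Below is a minimal PyTorch sketch under stated assumptions: a discrete action set for the evader, batches sampled from a replay buffer, and a hypothetical staged reward in which stage 1 only rewards opening distance from the pursuer and stage 2 additionally penalizes violating the evader's maneuver constraint. The reward terms, network sizes, and constraint model are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn


class QNet(nn.Module):
    """Small MLP mapping the evader's observation to one Q-value per discrete action."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)


def staged_reward(dist_to_pursuer, accel_norm, stage, accel_limit=1.0):
    """Hypothetical two-stage shaping: stage 1 only rewards opening distance
    from the pursuer; stage 2 also penalizes exceeding the maneuver limit."""
    reward = dist_to_pursuer
    if stage == 2 and accel_norm > accel_limit:
        reward -= 10.0
    return reward


def dqn_update(q, q_target, optimizer, batch, gamma=0.99):
    """One DQN gradient step on a batch of (obs, action, reward, next_obs, done).

    obs/nxt: float tensors (B, obs_dim); act: long tensor (B,);
    rew/done: float tensors (B,).
    """
    obs, act, rew, nxt, done = batch
    # Q(s, a) for the actions actually taken
    q_sa = q(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    # Bootstrapped target from the frozen target network
    with torch.no_grad():
        target = rew + gamma * (1.0 - done) * q_target(nxt).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the decentralized setup described in the abstract, each evader agent would hold its own QNet and call dqn_update on its own experience, switching stage from 1 to 2 once the easier sub-task is learned reliably.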
Authors: YAN Bo-wei, DU Run-le, BAN Xiao-jun, ZHOU Di (School of Astronautics, Harbin Institute of Technology, Harbin 150000, China; National Key Laboratory of Science and Technology on Test Physics and Numerical Mathematics, Beijing 100076, China)
Source: Navigation Positioning and Timing (《导航定位与授时》), CSCD, 2022, No. 6, pp. 40-47 (8 pages)
Keywords: evasion algorithm; deep reinforcement learning; multi-agent; deep Q-network (DQN)
