
Multi-agent Path Planning Algorithm based on Enhanced Experience Algorithm under Connectivity Constraints
Abstract: Multi-agent path planning under obstacle-avoidance and communication-connectivity constraints is an active research topic in the multi-agent field. This paper studies a scenario in which the agents depart from a starting area and only a target area is specified, not an individual target point for each agent; the agents must find optimal paths and their own target points while satisfying the obstacle-avoidance and communication-connectivity constraints. To address the low convergence efficiency, and the low safety caused by exploration, of the traditional Deep Q-Network (DQN) and Double-DQN (DDQN) algorithms, a multi-agent path planning algorithm based on experience-enhanced reinforcement learning, experience-enhanced DQN (EH-DQN), is proposed. First, a reward function construction method is designed that jointly accounts for the obstacle-avoidance and communication-connectivity constraints of the multi-agent path planning task. Second, each agent records its historical experience while acting and evaluates that experience. Third, guidance from historical experience is added to the action selection strategy, so that an agent using the experience-enhanced strategy has a higher probability of moving to high-value states. Finally, experiments are carried out. The results show that the method finds an optimal solution faster than traditional DQN and DDQN: convergence efficiency improves by 41% and 11% respectively (the English abstract in the source gives 12% for DDQN), the obstacle-avoidance index improves by 10% and 3%, and the connectivity index improves by 3% and 2%.
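The abstract does not give the details of the experience-enhanced action selection strategy, so the following is only a minimal sketch of one way such a strategy could look, not the paper's EH-DQN implementation. It assumes a grid world with four moves, evaluates experience as the running mean of discounted returns observed from each state, and biases only the exploratory branch of epsilon-greedy with a softmax over the experience value of successor states. The names `ExperienceMemory`, `select_action`, `ACTIONS`, and the parameters `gamma`, `epsilon`, and `temperature` are all hypothetical.

```python
import math
import random
from collections import defaultdict

# Hypothetical 4-connected grid moves; the abstract does not state the action set.
ACTIONS = {0: (0, 1), 1: (0, -1), 2: (1, 0), 3: (-1, 0)}


class ExperienceMemory:
    """Running mean of discounted returns per visited state (an assumed evaluation rule)."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def record_episode(self, states, rewards, gamma=0.9):
        # rewards[t] is the reward received after leaving states[t].
        # Walk backwards so g holds the discounted return from each state.
        g = 0.0
        for state, reward in zip(reversed(states), reversed(rewards)):
            g = reward + gamma * g
            self.total[state] += g
            self.count[state] += 1

    def value(self, state):
        # States never visited get a neutral value of 0.0.
        n = self.count[state]
        return self.total[state] / n if n else 0.0


def select_action(state, q_values, memory, epsilon=0.1, temperature=1.0, rng=random):
    """Epsilon-greedy whose exploratory branch is guided by recorded experience."""
    if rng.random() >= epsilon:
        # Exploit: greedy action under the current Q estimates.
        return max(ACTIONS, key=lambda a: q_values[a])
    # Explore: softmax over the experience value of each successor state,
    # so states that led to high returns before are visited more often.
    scores = []
    for dx, dy in ACTIONS.values():
        successor = (state[0] + dx, state[1] + dy)
        scores.append(memory.value(successor) / temperature)
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(list(ACTIONS), weights=weights, k=1)[0]
```

In this sketch `q_values` would come from the agent's Q-network for the current state; with an empty memory every successor scores 0.0, so exploration falls back to a uniform random choice.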
Author: ZHANG Li (张李), College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China
Source: Journal of Fujian Computer (《福建电脑》), 2023, No. 3, pp. 1-8 (8 pages)
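Funding: Supported by the National Natural Science Foundation of China (No. 61873033) and the Key Project of the Natural Science Foundation of Fujian Province (No. 2020H0012).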
Keywords: Reinforcement Learning; Multi-agent System; Path Planning; Action Selection Strategy; Experience Enhancement