Abstract
Existing multi-agent motion planning methods suffer from a lack of effective cooperation mechanisms, heavy dependence on communication, and the absence of information filtering mechanisms. To address these problems, an intention-based multi-agent deep reinforcement learning motion planning method is proposed, which helps agents reach their goals without collisions and without explicit communication. First, the concept of intention is introduced into the multi-agent motion planning problem: the agents' visual images and history maps are combined to predict the agents' intentions, so that agents can anticipate the actions of other agents and thus cooperate effectively. Second, a convolutional neural network architecture based on the attention mechanism is designed. The network is used to predict the agents' intentions and to select their actions; it filters out the useful visual input information while reducing the reliance of multi-agent cooperation on communication. Finally, a value-based deep reinforcement learning algorithm is proposed to learn the motion planning policy; improving the objective function and the calculation of the Q values makes the policy more stable. In tests on six different simulation scenes on the PyBullet simulation platform, the experimental results show that, compared with other advanced multi-agent motion planning methods, the proposed method improves the cooperation efficiency of multi-agent teams by 10.74% on average, a significant performance advantage.
Authors
PENG Yingxuan (彭滢璇)
SHI Dianxi (史殿习)
YANG Huanhuan (杨焕焕)
HU Haomeng (胡浩萌)
YANG Shaowu (杨绍武)
Affiliations: School of Computer Science, National University of Defense Technology, Changsha 410073, China; National Innovation Institute of Defense Technology, Academy of Military Sciences, Beijing 100166, China; Tianjin Artificial Intelligence Innovation Center, Tianjin 300457, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2023, Issue 10, pp. 156-164 (9 pages)
Funding
National Natural Science Foundation of China (91948303).
Keywords
Intention
Attention mechanism
Multi-agent system
Motion planning
Deep reinforcement learning