Abstract
In cooperative multi-agent systems, effective explicit communication can improve cooperation among agents. However, existing communication strategies typically transmit each agent's local observations directly as the message content, and the communication targets are usually fixed by a predefined topology. As a result, these strategies adapt poorly to changes in tasks and environments, which introduces uncertainty into the communication process; moreover, because neither the targets nor the message content are selective, communication bandwidth is wasted and communication efficiency is low. To address these problems, this paper proposes an approach that combines deep reinforcement learning with information theory to realize an adaptive explicit communication mechanism for multi-agent systems. The approach uses a prior network that lets each agent dynamically choose its communication targets, then applies mutual-information constraints and information bottleneck theory to filter out redundant information. Finally, each agent aggregates its own information and the messages it receives to infer more effective communication content. Experiments in cooperative navigation and traffic junction environments show that, compared with other methods, the proposed method improves the interaction efficiency and cooperation stability of multi-agent systems.
Authors
高兵
张哲婕
邹启杰
刘治国
赵锡玲
GAO Bing; ZHANG Zhejie; ZOU Qijie; LIU Zhiguo; ZHAO Xiling (School of Information Engineering Faculty, Dalian University, Dalian 116622, China; Key Laboratory of Communication & Network, Dalian University, Dalian 116622, China)
Source
《航空学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 18, pp. 221-233 (13 pages)
Acta Aeronautica et Astronautica Sinica
Funding
National Natural Science Foundation of China (61673084)
Liaoning Provincial Department of Education Project, 2021 (LJKZ1180)
Keywords
multi-agent deep reinforcement learning
mutual information
explicit communication
information bottleneck theory
cooperative environment