
Heterogeneous agent formation obstacle avoidance control method based on deep reinforcement learning
Abstract: To address the heterogeneity of individual agents and the complexity of multiple tasks in formation obstacle avoidance control, a heterogeneous agent formation obstacle avoidance control method based on deep reinforcement learning is proposed. First, to handle agent heterogeneity, the local observation representations used by the leader and the follower agents are described in detail. Second, composite reward functions combining formation, obstacle avoidance, and navigation terms are designed according to each agent's task, to achieve more flexible and efficient formation obstacle avoidance control. Finally, an actor-critic network incorporating an attention mechanism is designed to jointly train the motion strategies of the leader and the followers, so that the agents can progressively optimize their overall strategy and cope with complex interaction information. Numerical simulation results show that the proposed method enables the agents to complete their respective tasks effectively and, compared with other reinforcement learning algorithms, lets the agents learn the optimal motion strategy more quickly and more accurately, indicating potential value for future applications in complex environments.
Authors: YU Yifan; YUE Shengzhi; XU Jun; SONG Jinghan; LIN Yuanshan (School of Information Science & Engineering, Dalian Ocean University, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian Ocean University, Dalian 116023, China)
Source: Modern Electronics Technique (《现代电子技术》), Peking University Core Journal, 2024, No. 15, pp. 102-108 (7 pages)
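Funding: Guangxi Key Research and Development Program (桂科AB23075150); Open Project of the Key Laboratory of Environment Controlled Aquaculture, Ministry of Education (202219); Liaoning Province Applied Basic Research Program (2022JH2/101300187); 2023 Central Government Fiscal Subsidy Project for Liaoning Fisheries.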
Keywords: formation obstacle avoidance control; heterogeneity; multi-tasking; leader-follower; deep reinforcement learning; composite reward function; attention mechanism; motion strategy
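
The abstract names three reward components (formation, obstacle avoidance, navigation) combined according to each agent's task, but this record does not give their formulas. The following is only a minimal illustrative sketch of how such a composite reward might be assembled. Every function name, the weights, the safety radius, the distance-based form of each term, and the split of navigation to the leader and formation to the followers are assumptions made for illustration, not the authors' definitions.

```python
import math

# Hypothetical reward terms: the source record does not specify their form.

def formation_reward(follower_pos, leader_pos, desired_offset):
    """Negative distance from the follower to its desired slot relative to the leader."""
    target = (leader_pos[0] + desired_offset[0], leader_pos[1] + desired_offset[1])
    return -math.dist(follower_pos, target)

def avoidance_reward(pos, obstacles, safe_radius):
    """Penalty that grows linearly as the agent intrudes into an obstacle's safety radius."""
    penalty = 0.0
    for obs in obstacles:
        d = math.dist(pos, obs)
        if d < safe_radius:
            penalty -= safe_radius - d
    return penalty

def navigation_reward(pos, goal):
    """Negative distance from the agent to the goal position."""
    return -math.dist(pos, goal)

def leader_reward(pos, goal, obstacles, w_nav=1.0, w_avoid=5.0, safe_radius=1.0):
    """Assumed leader task: navigate to the goal while avoiding obstacles."""
    return (w_nav * navigation_reward(pos, goal)
            + w_avoid * avoidance_reward(pos, obstacles, safe_radius))

def follower_reward(pos, leader_pos, desired_offset, obstacles,
                    w_form=1.0, w_avoid=5.0, safe_radius=1.0):
    """Assumed follower task: hold the formation slot while avoiding obstacles."""
    return (w_form * formation_reward(pos, leader_pos, desired_offset)
            + w_avoid * avoidance_reward(pos, obstacles, safe_radius))

if __name__ == "__main__":
    obstacles = [(1.0, 0.5)]
    print(leader_reward((0.0, 0.0), (3.0, 4.0), obstacles))
    print(follower_reward((1.0, 0.0), (0.0, 0.0), (1.0, 0.0), obstacles))
```

Giving each role its own weighted sum is one simple way to reflect heterogeneous tasks in a shared training loop; the weights here are arbitrary and would need tuning in any real setup.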