Abstract
Against the background of future modern maritime combat, a multi-agent deep reinforcement learning scheme is proposed to complete the cooperative round-up task in swarm game confrontation among unmanned surface vehicles (USVs). First, based on different combat modes and application scenarios, a multi-agent deep deterministic policy gradient algorithm with distributed execution is proposed, and its principle is introduced. Second, a specific combat scenario platform is simulated, and the multi-agent network model, reward function mechanism, and training strategy are designed. The experimental results show that the proposed method can effectively handle the cooperative round-up decision-making problem against enemy USVs and is highly efficient across different combat scenarios. This work provides a theoretical reference for future research on intelligent decision-making of USVs in complex combat scenarios.
Authors
于长东
刘新阳
陈聪
刘殿勇
梁霄
YU Changdong; LIU Xinyang; CHEN Cong; LIU Dianyong; LIANG Xiao (College of Artificial Intelligence, Dalian Maritime University, Dalian 116026, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin Engineering University, Harbin 150001, China; School of Naval Architecture and Ocean Engineering, Dalian Maritime University, Dalian 116026, China)
Source
《水下无人系统学报》
2024, No. 1, pp. 79-86 (8 pages)
Journal of Unmanned Undersea Systems
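Funding
National Natural Science Foundation of China (52271302)
National Basic Scientific Research Program of China (JCKY2022410C012)
Liaoning Provincial Applied Basic Research Program (2023JH2/101300198)
Dalian Science and Technology Innovation Fund (2021JJ12GX017)
Fundamental Research Funds for the Central Universities (3132023512)
Project supported by the National Key Laboratory of Autonomous Marine Vehicle Technology (2024-HYHXQ-WDZC08)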
Keywords
unmanned surface vehicle swarm
multi-agent deep deterministic policy gradient algorithm
deep reinforcement learning
intelligent decision-making
game confrontation