Abstract
Planning mobile marine environmental observation platforms soundly and efficiently benefits both the design of marine environmental observation networks and the collection of marine environmental information. Because the ocean is vast and observation resources are limited, this paper uses deep reinforcement learning to plan the marine environmental observation network. To address the choice between discrete and continuous action designs when reinforcement learning is applied to path planning, two algorithms, DQN and DDPG, are applied to the problem in single-platform and multi-platform experiments. The experimental results show that the reward obtained by the DQN algorithm with discrete actions is better than that of the DDPG algorithm with continuous actions. Further analysis of the sampling paths that the two algorithms produce for the mobile observation platforms shows that the sampling results of DQN with discrete actions are also better. The results indicate that DQN with discrete actions can maximize the collection of effective data from the marine environment, which demonstrates the effectiveness and feasibility of the method.
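The distinction between discrete and continuous action designs can be illustrated with a minimal sketch. It is not the authors' implementation: the eight-heading action table, the fixed step length `STEP_KM`, and the function names are all hypothetical, chosen only to show how a DQN-style agent would pick an index from a fixed set while a DDPG-style agent would output a real-valued heading.

```python
import math

# Hypothetical discrete action set: 8 compass headings (assumption, not from the paper).
DISCRETE_HEADINGS = [k * math.pi / 4 for k in range(8)]
STEP_KM = 1.0  # assumed distance travelled per decision step


def move(position, heading, step=STEP_KM):
    """Advance an (x, y) position by `step` along `heading` (radians)."""
    x, y = position
    return (x + step * math.cos(heading), y + step * math.sin(heading))


def step_discrete(position, action_index):
    """Discrete action (DQN-style): an index into the fixed heading table."""
    return move(position, DISCRETE_HEADINGS[action_index])


def step_continuous(position, action_value):
    """Continuous action (DDPG-style): a real number clipped to [-1, 1]
    and scaled to a heading in [-pi, pi]."""
    clipped = max(-1.0, min(1.0, action_value))
    return move(position, clipped * math.pi)
```

With the discrete design the platform can only move along one of the listed headings, whereas the continuous design allows any heading but requires the agent to learn a real-valued output.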
Authors
赵玉新
杜登辉
成小会
周迪
邓雄
刘延龙
ZHAO Yuxin; DU Denghui; CHENG Xiaohui; ZHOU Di; DENG Xiong; LIU Yanlong (College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China; China Ship Development and Design Center, Wuhan 430064, China)
Source
《智能系统学报》
CSCD
Peking University Core Journal
2022, No. 1, pp. 192-200 (9 pages)
CAAI Transactions on Intelligent Systems
Funding
National Natural Science Foundation of China (41676088)
Fundamental Research Funds for the Central Universities (3072021CFJ0401)