Abstract
To make maritime rescue operations more efficient, this study uses the Deep Deterministic Policy Gradient (DDPG) algorithm to design a method by which a vessel tracks a rescue target, covering the state space, the action space, and the reward function, and applies the method to rescue-target tracking by an unmanned surface vessel. The results show that the success rate of the DDPG algorithm stabilizes close to 100%, indicating excellent performance. The cumulative reward per episode eventually stabilizes at about 10, and the average duration stabilizes at about 80 s. The algorithm adjusts its movement strategy according to the state of the surrounding environment, meets the urgency requirements of maritime rescue, and offers a new approach for research in related fields.
Authors
SONG Lei-Zhen (宋雷震)
LV Dong-Fang (吕东芳)
School of Information Engineering, Huainan Union University, Huainan 232038, Anhui, China
Source
Journal of Engineering of Heilongjiang University (《黑龙江大学工程学报(中英俄文)》; Chinese, English, and Russian edition)
2024, No. 1, pp. 58-64 (7 pages)
Funding
University-level Natural Science Project of Huainan Union University (LZX1902)
Key Natural Science Project of Anhui Province (KJ2021A1311)