Abstract
To address multi-ship collision avoidance for unmanned ships, a behavior decision-making method for collision avoidance in multi-ship encounters is proposed. It is based on the deep deterministic policy gradient (DDPG) algorithm and combines ship domain knowledge, the International Regulations for Preventing Collisions at Sea (COLREGs), and ship maneuvering characteristics. The neural network model is built with gated recurrent units (GRU) and applies layer normalization, which allows it to process high-dimensional observation data effectively and improves the efficiency of behavior decision-making. The reward function designed in this paper complies with the COLREGs and accounts for the ship-handling practice of using small rudder angles for avoidance wherever possible. Simulation experiments of multi-ship encounters verify the advantages of the proposed collision avoidance decision-making method in terms of flexibility and effectiveness.
Authors
GUAN Wei (关巍)
LUO Wenzhe (罗文哲)
CUI Zhewen (崔哲闻)
Navigation College, Dalian Maritime University, Dalian 116026, China
Source
Journal of Dalian Maritime University (《大连海事大学学报》)
CAS
CSCD
PKU Core (北大核心)
2024, No. 1, pp. 11-19 (9 pages in total)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 52171342).
Keywords
multi-ship collision avoidance
behavioral decision-making
International Regulations for Preventing Collisions at Sea (COLREGs)
deep reinforcement learning
gated recurrent unit (GRU)