Abstract
This paper proposes an intelligent ship collision avoidance method based on deep reinforcement learning (DRL). The method combines the D3QN (double deep Q-learning network with dueling architecture) algorithm with a ship domain model, designs the reward function according to the collision avoidance rules of the International Regulations for Preventing Collisions at Sea (COLREGs), and implements prioritized experience replay via the temporal-difference (TD) method to build an autonomous collision avoidance agent. A simulation environment is built with ROS-Gazebo, and a neural network processes the visual and radar data from the environment to extract environmental features quickly and effectively. The results show that, compared with the traditional DQN algorithm, the method makes better decisions and requires less training time. During collision avoidance, the agent correctly identifies the encounter situation, selects avoidance actions that comply with COLREGs, and avoids the target ship accurately and in time.
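The three ingredients named in the abstract (dueling aggregation, the double-DQN target, and TD-error-based replay priority) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the discount factor, and the `eps`/`alpha` priority parameters are hypothetical choices for illustration, and the network outputs are replaced by hand-picked numbers.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: List[float],
                      q_target_next: List[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    return reward + gamma * q_target_next[best]


def td_priority(q_sa: float, target: float,
                eps: float = 1e-3, alpha: float = 0.6) -> float:
    """Proportional replay priority from the absolute TD error.
    eps and alpha are assumed values, not taken from the paper."""
    return (abs(target - q_sa) + eps) ** alpha


# Hand-picked stand-ins for network outputs at the next state (assumed values).
q_online_next = dueling_q(1.0, [0.0, 2.0, 1.0])   # online network
q_target_next = dueling_q(0.5, [1.0, 0.0, 2.0])   # target network

target = double_dqn_target(reward=1.0, done=False, gamma=0.9,
                           q_online_next=q_online_next,
                           q_target_next=q_target_next)
priority = td_priority(q_sa=0.05, target=target, eps=0.0, alpha=1.0)
print(target, priority)
```

In a full agent, the priority would determine how often a stored transition is sampled from the replay buffer, and the reward would come from the ship-domain and COLREGs terms described in the abstract.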
Authors
陈立家
孙中泽
黄立文
许毅
李胜为
CHEN Lijia; SUN Zhongze; HUANG Liwen; XU Yi; LI Shengwei (School of Shipping, Wuhan University of Technology, Wuhan 430063, China; Hubei Key Laboratory of Inland Water Navigation Technology, Wuhan 430063, China; School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China)
Source
《武汉理工大学学报(交通科学与工程版)》
2023, No. 1, pp. 191-196 (6 pages)
Journal of Wuhan University of Technology (Transportation Science & Engineering)
Funding
National Key Research and Development Program of China (2018YFC1407400/03, 2018YFC0810405).
Keywords
deep reinforcement learning
D3QN
ship collision avoidance
ship domain
intelligent decision-making