Abstract
To enable unmanned surface vessels (USVs) to navigate autonomously and avoid obstacles in unknown environments, a USV obstacle-avoidance path planning algorithm based on the deep Q network is proposed. The algorithm applies deep learning to Q-learning, using a deep neural network to estimate the Q function, which effectively addresses the curse of dimensionality that traditional Q-learning tends to encounter when planning paths in complex waters. Training the model effectively establishes a mapping between perception (input) and decision (output). Based on this mapping, the USV executes the action with the largest Q value in each decision cycle, so that it successfully avoids obstacles and plans the optimal route. Simulation results show that the average loss function converges well after 8000 training iterations, which demonstrates that the USV has effectively learned how to avoid obstacles and plan the optimal route. The method is a model-free, end-to-end path planning algorithm.
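The decision rule described in the abstract (execute the action with the largest Q value in each decision cycle) and the Q-learning target that a deep Q network is typically trained against can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-action set, the discount factor `gamma=0.9`, the epsilon-greedy exploration helper, and all function names are assumptions made for the example, and the Q values are plain lists standing in for the output of a neural network.

```python
import random

# Hypothetical discrete action set; the abstract does not specify one.
ACTIONS = ["port", "ahead", "starboard"]


def greedy_action(q_values):
    """Return the index of the action with the largest Q value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise act greedily.

    Exploration is a common training-time addition and is an assumption
    here; the abstract only describes the greedy choice.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_values, action, target):
    """Per-sample squared error between the target and Q(s, action)."""
    return (target - q_values[action]) ** 2


if __name__ == "__main__":
    q = [0.2, 1.5, -0.3]  # stand-in for network output Q(s, .)
    a = greedy_action(q)
    y = td_target(reward=-1.0, next_q_values=[0.0, 2.0, 1.0])
    print(ACTIONS[a], y, squared_td_error(q, a, y))
```

In a full deep Q network the lists above would be replaced by the outputs of the trained network, and the squared error would be averaged over a batch and minimised by gradient descent.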
Authors
SUI Bowen (随博文)
HUANG Zhijian (黄志坚)
JIANG Baoxiang (姜宝祥)
ZHENG Huan (郑欢)
WEN Jiayi (温家一)
Merchant Marine College, Shanghai Maritime University, Shanghai 201306, China
Source
Journal of Shanghai Maritime University (《上海海事大学学报》)
Peking University Core Journal (北大核心)
2020, No. 3, pp. 1-5, 116 (6 pages in total)
Funding
National Natural Science Foundation of China (61403250).
Keywords
unmanned surface vessel (USV)
autonomous obstacle avoidance
path planning
deep Q network
convolutional neural network
reinforcement learning