Abstract
Actor-Critic algorithms built on fully connected neural networks train slowly, converge with difficulty, and handle long action-memory sequences poorly in complex path planning environments. To address these shortcomings, this paper proposes a path planning algorithm for unmanned surface vessels (USVs) based on a double-layer recurrent neural network (DRNN). The input to the algorithm is not a single state but a sequence of a certain length composed of states, actions, and rewards (a macro action). In terms of network architecture, a recurrent neural network (RNN) retains historical information and uses it to influence the current input and output, and a DRNN built on the RNN structure has the same property. Because the DRNN takes into account the history of environment interaction over a certain period of time, it helps the network recognize patterns in continuous action sequences (macro actions). In simulation experiments, the algorithm is compared with the conventional Actor-Critic algorithm on several maps. The results show that the proposed algorithm clearly improves on the Actor-Critic algorithm in average number of steps, success rate, and average reward.
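The abstract describes the input as a sequence of states, actions, and rewards fed through two stacked recurrent layers. The following is a minimal sketch of that idea only, not the authors' implementation: the class name `TwoLayerRecurrentActorCritic`, the Elman-style `tanh` cells, the hidden size, the one-hot action encoding, the random weights, and the toy history are all assumptions made for illustration, and no training step is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_rec, x, h):
    """One Elman-style update: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TwoLayerRecurrentActorCritic:
    """Hypothetical two-layer recurrent network with actor and critic heads."""

    def __init__(self, state_dim, n_actions, hidden=8, seed=0):
        rng = random.Random(seed)
        # Each time step is encoded as: state + one-hot action + scalar reward.
        in_dim = state_dim + n_actions + 1
        self.n_actions = n_actions
        self.hidden = hidden
        self.w_in1 = make_matrix(hidden, in_dim, rng)
        self.w_rec1 = make_matrix(hidden, hidden, rng)
        self.w_in2 = make_matrix(hidden, hidden, rng)
        self.w_rec2 = make_matrix(hidden, hidden, rng)
        self.w_actor = make_matrix(n_actions, hidden, rng)
        self.w_critic = make_matrix(1, hidden, rng)

    def forward(self, history):
        """history: list of (state, action_index, reward) tuples."""
        h1 = [0.0] * self.hidden
        h2 = [0.0] * self.hidden
        for state, action, reward in history:
            one_hot = [1.0 if i == action else 0.0 for i in range(self.n_actions)]
            x = list(state) + one_hot + [float(reward)]
            h1 = rnn_step(self.w_in1, self.w_rec1, x, h1)   # first recurrent layer
            h2 = rnn_step(self.w_in2, self.w_rec2, h1, h2)  # second layer reads the first
        policy = softmax(matvec(self.w_actor, h2))  # actor: action probabilities
        value = matvec(self.w_critic, h2)[0]        # critic: scalar value estimate
        return policy, value


# Toy interaction history on a grid map: (x, y) state, action index, step reward.
history = [((0.0, 0.0), 1, -0.1), ((0.0, 1.0), 1, -0.1), ((1.0, 1.0), 2, -0.1)]
net = TwoLayerRecurrentActorCritic(state_dim=2, n_actions=4)
policy, value = net.forward(history)
```

Because the second layer's hidden state summarizes the whole sequence, the actor and critic heads condition on the interaction history rather than on the latest state alone, which is the property the abstract attributes to the DRNN.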
Authors
ZHANG Zhixin; GAO Jian; ZHAO Dawei (Shenyang Bureau of the Naval Equipment Department, Harbin 150001, China; College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China)
Source
《应用科技》 (Applied Science and Technology)
CAS
2023, No. 3, pp. 100-107 (8 pages)
Keywords
fully connected neural network
path planning
recurrent neural network
memory sequences
macro action
double-layer network architecture
state
historical information