Abstract
When the deep deterministic policy gradient (DDPG) algorithm is applied to the motion control of a tethered unmanned remotely operated vehicle (ROV), it suffers from three problems: bad samples undermine learning stability, the agent explores the environment too little, and training is slow and hard to converge. To address these problems, the DDPG algorithm is improved in three respects (neural network structure, noise injection, and integration of supervised learning), and a supervised DDPG algorithm based on a hybrid neural network structure and parameter noise is proposed. Simulation results show that the supervised DDPG algorithm is more effective than both the conventional DDPG algorithm and the traditional proportional-integral-derivative (PID) algorithm.
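The abstract names parameter noise as the exploration mechanism but gives no implementation details. The following is a minimal sketch of the general idea only, assuming a toy linear actor with a tanh output; the function names, the parameter values, the state layout, and the noise scale `sigma` are all hypothetical and are not taken from the paper.

```python
import math
import random


def policy(weights, bias, state):
    """Deterministic actor: one bounded action from a linear layer plus tanh."""
    z = sum(w * s for w, s in zip(weights, state)) + bias
    return math.tanh(z)


def perturb_parameters(weights, bias, sigma, rng):
    """Return a noisy copy of the actor parameters (parameter-space noise)."""
    noisy_weights = [w + rng.gauss(0.0, sigma) for w in weights]
    noisy_bias = bias + rng.gauss(0.0, sigma)
    return noisy_weights, noisy_bias


rng = random.Random(0)
weights, bias = [0.5, -0.3, 0.1], 0.0  # hypothetical actor parameters
state = [0.2, 0.4, -0.1]               # hypothetical ROV state vector

# Perturb the parameters once (for example, once per episode). The perturbed
# actor is still deterministic, so a given state always maps to the same
# exploratory action until the parameters are perturbed again.
noisy_weights, noisy_bias = perturb_parameters(weights, bias, sigma=0.1, rng=rng)
clean_action = policy(weights, bias, state)
explore_action = policy(noisy_weights, noisy_bias, state)
repeat_action = policy(noisy_weights, noisy_bias, state)
```

Unlike noise added directly to the action at every step, a perturbation of the parameters yields exploration that is consistent across a whole rollout, which is the usual motivation for this technique; whether the paper applies it per episode or on another schedule is not stated in the abstract.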
Authors
黄兆军
曾明如
HUANG Zhaojun; ZENG Mingru (School of Mechanical and Electrical Engineering, Zhuhai City Polytechnic, Zhuhai 519090, Guangdong, China; School of Information Engineering, Nanchang University, Nanchang 330031, China)
Source
《实验室研究与探索》
CAS
Peking University Core Journals (北大核心)
2024, No. 7, pp. 34-38, 53 (6 pages)
Research and Exploration In Laboratory
Funding
2023 Guangdong Provincial Characteristic Innovation Project for Regular Higher Education Institutions (2023KTSCX330).
Keywords
deep deterministic policy gradient (DDPG) algorithm
hybrid neural network
parameter noise
supervised learning
tethered unmanned remotely operated vehicle (ROV)
motion control