Journal Articles
2 articles found
1. Off-policy correction algorithm for double Q network based on deep reinforcement learning
Authors: Qingbo Zhang, Manlu Liu, Heng Wang, Weimin Qian, Xinglang Zhang. IET Cyber-Systems and Robotics (EI), 2023, Issue 4, pp. 16-26 (11 pages).
A deep reinforcement learning (DRL) method based on the deep deterministic policy gradient (DDPG) algorithm is proposed to address the mismatch between the training samples the agent needs and those it actually collects during training, the overestimation and underestimation of Q-values, and insufficiently dynamic policy exploration. The method introduces the Actor-Critic Off-Policy Correction (AC-Off-POC) reinforcement learning framework and an improved double Q-value learning method, which enables the value function network in the target task to evaluate the policy network more accurately and converge to the optimal policy more quickly and stably, yielding higher returns. The method is applied to multiple MuJoCo tasks on the OpenAI Gym simulation platform. The experimental results show that it outperforms both the DDPG algorithm based solely on the off-policy correction framework (AC-Off-POC) and conventional DRL algorithms. The returns and stability of the proposed double-Q-network off-policy correction algorithm for the deep deterministic policy gradient (DCAOP-DDPG) are significantly higher than those of other DRL algorithms.
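The abstract does not give DCAOP-DDPG's exact update rule, so the following is only a minimal PyTorch sketch of the clipped double-Q target (TD3-style) that the "improved double Q-value learning" builds on: the Bellman target uses the minimum of two target critics to curb overestimation. The network shapes, class names, and hyperparameters here are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a clipped double-Q target for a DDPG-style agent.
# All architectures and dimensions below are assumptions for illustration.
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Q(s, a) approximator: concatenates state and action."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def double_q_target(reward, next_state, done, target_actor,
                    target_q1, target_q2, gamma=0.99):
    """Compute r + gamma * min(Q1', Q2') to curb Q-value overestimation."""
    with torch.no_grad():
        next_action = target_actor(next_state)
        q1 = target_q1(next_state, next_action)
        q2 = target_q2(next_state, next_action)
        return reward + gamma * (1.0 - done) * torch.min(q1, q2)

# Usage on a dummy batch (shapes only; no training loop shown).
state_dim, action_dim, batch = 8, 2, 32
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
q1_t, q2_t = Critic(state_dim, action_dim), Critic(state_dim, action_dim)
y = double_q_target(torch.zeros(batch, 1), torch.randn(batch, state_dim),
                    torch.zeros(batch, 1), actor, q1_t, q2_t)
print(y.shape)  # torch.Size([32, 1])
```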
Keywords: neural network, Q-learning, reinforcement learning
2. Path planning of hyper-redundant manipulators for narrow spaces (Cited by: 1)
Authors: Haoxiang Su, Manlu Liu, Hongwei Liu, Jianwen Huo, Songlin Gou, Qing Su. IET Cyber-Systems and Robotics (EI), 2022, Issue 3, pp. 251-263 (13 pages).
Compared with the traditional manipulator, the hyper-redundant manipulator has the advantage of high flexibility, which makes it particularly suitable for all kinds of complex working environments. However, complex environments require the hyper-redundant manipulator to have stronger obstacle-avoidance ability and adaptability. To address the heavy computation and poor obstacle-avoidance performance of path planning for hyper-redundant manipulators, this paper introduces the "backbone curve" approach, which transforms the problem of solving for joint path points into that of determining a backbone curve. After the backbone curve approach is used to design a curve that meets the obstacle-avoidance and end-pose requirements, least-squares fitting and an improved space joint fitting are used to match the plane curve and the space curve, respectively, and the algorithm limits the angle of each joint of the manipulator. Furthermore, a fused obstacle-avoidance algorithm is proposed to obtain the joint path points of the hyper-redundant manipulator. Compared with the classic Jacobian iteration method, this method avoids obstacles better, is simple and efficient to compute, and fully reflects the geometric characteristics of the manipulator. Simulation experiments prove the feasibility of the algorithm.
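The abstract does not spell out the paper's fitting algorithm, so the NumPy sketch below shows only the basic backbone-curve step the method relies on: placing fixed-length links of the manipulator along a precomputed curve so the joint chain approximates its shape. The curve, link length, and link count are illustrative assumptions.

```python
# Sketch: fit fixed-length manipulator links onto a sampled backbone curve.
# The specific curve and dimensions are assumptions for illustration.
import numpy as np

def fit_links_to_curve(curve: np.ndarray, link_len: float, n_links: int):
    """Place joints along `curve` (N x 3 sample points) so that consecutive
    joints are exactly `link_len` apart, approximating the backbone shape."""
    joints = [curve[0]]
    idx = 0
    for _ in range(n_links):
        base = joints[-1]
        # Walk along the sampled curve until we exit a sphere of radius
        # link_len centred on the current joint.
        while idx < len(curve) - 1 and np.linalg.norm(curve[idx] - base) < link_len:
            idx += 1
        direction = curve[idx] - base
        joints.append(base + link_len * direction / np.linalg.norm(direction))
    return np.array(joints)

# Example: an S-shaped backbone curve threading a narrow passage.
t = np.linspace(0.0, 2.0 * np.pi, 500)
curve = np.stack([t, 0.3 * np.sin(t), np.zeros_like(t)], axis=1)
joints = fit_links_to_curve(curve, link_len=0.8, n_links=6)
print(np.round(joints, 3))  # 7 joint positions, each link exactly 0.8 long
```

A real planner would additionally enforce per-joint angle limits and obstacle clearance when choosing each joint position, as the abstract indicates; this sketch covers only the geometric fitting step.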
Keywords: convergence, dexterous manipulators, geometric algebra, motion planning