
Real-time Obstacle Avoidance of Robotic Manipulator Based on Improved Q-learning
Abstract: To improve the adaptability of a manipulator avoiding obstacles in real time, a control avoidance method based on improved Q-learning is proposed. First, deep reinforcement learning is used to reward and punish the manipulator's actions, and feature representations are learned by a deep neural network. Then, the Markov decision process is defined by state and action sets together with an environment transition probability matrix. At the same time, the normalized advantage function is combined with the Q-learning algorithm to support robot systems defined in continuous space. Experimental results show that the proposed method overcomes the slow convergence of Q-learning, realizes real-time obstacle avoidance for a high-performance manipulator, and contributes to safe human-machine coexistence.
Authors: Wu Daiyan (吴戴燕); Liu Shilin (刘世林) (Department of Mechanical and Electrical Engineering, Anhui Lu'an Technician College, Lu'an 237001, China; School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China)
Source: Journal of Taizhou University (《台州学院学报》), 2022, Issue 6, pp. 13-20 (8 pages)
Funding: Major Project of Natural Science Research of Anhui Higher Education Institutions (KJ2018ZD066); Key Project of Natural Science Research of Anhui Higher Education Institutions (KJ2019A1184)
Keywords: manipulator; Markov decision process; deep reinforcement learning; Q-learning; normalized advantage function
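The key idea summarized in the abstract is combining a normalized advantage function (NAF) with Q-learning so that the Q-function stays maximizable in a continuous action space: Q(s, a) = V(s) - ½(a - μ(s))ᵀP(s)(a - μ(s)), where P(s) = L(s)L(s)ᵀ is positive definite, so the greedy continuous action is simply μ(s). The sketch below illustrates only this parameterization; the linear stand-ins `W_v`, `W_mu`, `W_l` are hypothetical placeholders for the deep networks the paper would actually train, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim = 4, 2

# Hypothetical "learned" parameters (random linear stand-ins for deep networks).
W_v = rng.normal(size=state_dim)                          # V(s)  = W_v . s
W_mu = rng.normal(size=(action_dim, state_dim))           # mu(s) = W_mu s
W_l = rng.normal(size=(action_dim * (action_dim + 1) // 2, state_dim))

def naf_q(s, a):
    """Q-value under the NAF parameterization Q = V + A."""
    v = W_v @ s
    mu = W_mu @ s
    # Build lower-triangular L(s), then P(s) = L L^T (positive definite
    # because the diagonal of L is exponentiated to stay positive).
    L = np.zeros((action_dim, action_dim))
    L[np.tril_indices(action_dim)] = W_l @ s
    np.fill_diagonal(L, np.exp(np.diag(L)))
    P = L @ L.T
    d = a - mu
    # Advantage is a negative quadratic centred at mu(s), so A <= 0.
    advantage = -0.5 * d @ P @ d
    return v + advantage

s = rng.normal(size=state_dim)
mu_s = W_mu @ s

# The greedy action in continuous space is mu(s): any perturbation lowers Q.
assert naf_q(s, mu_s) >= naf_q(s, mu_s + 0.1)
assert naf_q(s, mu_s) >= naf_q(s, mu_s - 0.5)
```

Because the advantage term can never be positive, taking the greedy action reduces to evaluating μ(s), avoiding the per-step argmax over a discretized action set that makes plain Q-learning impractical for a manipulator's continuous joint space.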