
Fast Training Method for Manipulator Control Based on Deep Reinforcement Learning
(基于深度强化学习的机械臂控制快速训练方法)

Cited by: 5
Abstract: Artificial Intelligence (AI) is widely used in robot control, and robot control algorithms are gradually shifting from model-driven to data-driven. Deep reinforcement learning can perceive and make decisions in complex environments and can solve manipulator control problems in high-dimensional and continuous state spaces. However, the data-driven training process in deep reinforcement learning relies heavily on GPU computing power and incurs a large training time cost. To address this problem, this study proposes a fast training method for manipulator control based on deep reinforcement learning that trains first on a simplified model (2D model) and then on a complex model (3D model). A Deep Deterministic Policy Gradient (DDPG) algorithm replaces the inverse kinematics solver used in traditional manipulator control and drives the manipulator's end effector to the target position directly through data-driven training, thereby reducing the training time cost. In addition, different configurations of the state vector and reward function are compared. The final trained model is implemented and verified on a real manipulator. The results show that the control performance meets the application requirements of item sorting, and that the method shortens the average training time by nearly 52% compared with training directly on the 3D model.
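As a concrete illustration of the first training stage (not code from the paper), a simplified 2-D two-link planar arm environment of the kind the method trains on before moving to the full 3-D model might look like the sketch below. The state layout (joint angles plus target coordinates), link lengths, action clipping, and dense distance-based reward are all assumptions chosen for illustration; the paper compares several state-vector and reward-function settings.

```python
import numpy as np

class TwoLinkArm2DEnv:
    """Hypothetical simplified 2-D (planar) two-link arm reaching task.

    State:  [theta1, theta2, target_x, target_y]  (one possible setting)
    Action: joint-angle increments, clipped to +/- 0.1 rad per step
    Reward: negative Euclidean distance from end effector to target
    """

    def __init__(self, l1=1.0, l2=1.0, goal_tol=0.05):
        self.l1, self.l2, self.goal_tol = l1, l2, goal_tol
        self.reset()

    def _end_effector(self):
        # Forward kinematics of a planar two-link arm.
        t1, t2 = self.theta
        x = self.l1 * np.cos(t1) + self.l2 * np.cos(t1 + t2)
        y = self.l1 * np.sin(t1) + self.l2 * np.sin(t1 + t2)
        return np.array([x, y])

    def _state(self):
        return np.concatenate([self.theta, self.target])

    def reset(self, target=(1.0, 1.0)):
        self.theta = np.zeros(2)
        self.target = np.array(target, dtype=float)
        return self._state()

    def step(self, action):
        # Apply bounded joint-angle increments instead of solving
        # inverse kinematics analytically.
        self.theta = self.theta + np.clip(action, -0.1, 0.1)
        dist = np.linalg.norm(self._end_effector() - self.target)
        reward = -dist                      # dense distance-based reward
        done = dist < self.goal_tol
        return self._state(), reward, done
```

A DDPG agent (actor and critic networks) would interact with this low-dimensional environment first; once the policy converges, training continues on the higher-fidelity 3-D model, which is what the reported ~52% reduction in average training time is measured against.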
Authors: ZHAO Yinfu (赵寅甫); FENG Zhengyong (冯正勇) — School of Electronic Information Engineering, China West Normal University, Nanchong, Sichuan 637009, China
Published in: Computer Engineering (《计算机工程》), 2022, Issue 8, pp. 113-120 (8 pages). Indexed in: CAS, CSCD, PKU Core Journals.
Funding: China West Normal University Talent Fund (17YC046); China West Normal University Doctoral Research Start-up Project "QoE Optimization of Streaming Media Transmission over Heterogeneous Wireless Networks" (13E003).
Keywords: manipulator; position control; Artificial Intelligence (AI); deep reinforcement learning; Deep Deterministic Policy Gradient (DDPG) algorithm
