
Research Method of Manipulator Control Based on PPO (基于PPO的机械臂控制研究方法) Cited by: 1
Abstract Many different algorithms are currently applied to manipulator control, such as traditional adaptive PD control and fuzzy adaptive control; most of them depend on a mathematical model of the system. There are also control methods based on reinforcement learning, such as DQN (Deep Q Network) and Sarsa. In continuous, high-dimensional action spaces, however, these reinforcement learning algorithms suffer from low learning efficiency, difficulty in designing the reward, and poor control performance. This paper studies manipulator grasping at arbitrary positions using the PPO (Proximal Policy Optimization) algorithm and compares the experimental data with those obtained with the Actor-Critic algorithm. The comparison verifies that PPO achieves good control performance and that its learning is both efficient and stable.
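The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the clipped surrogate objective that PPO is generally built on, not the paper's code. The function name `ppo_clip_objective`, its parameters, and the default clip range of 0.2 are assumptions chosen for illustration. The sketch shows how the probability ratio between the new and old policy is clipped so that a single update cannot move the policy too far, which is the usual explanation for PPO's stable learning.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration only; all names and the default
    clip_eps are assumptions, not taken from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Identical policies give a ratio of 1, so the result is the mean advantage.
    print(ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [1.0, 3.0]))
```

With a positive advantage, a ratio above `1 + clip_eps` earns no extra credit, so the update has no incentive to push the policy further. With a negative advantage the unclipped term is kept, so a harmful move is still fully penalized.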
Authors GUO Kun (郭坤); WU Qu (武曲); ZHANG Yi (张义) (School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China)
Source Computer Knowledge and Technology (《电脑知识与技术》), 2021, No. 4, pp. 222-225 (4 pages)
Funding Supported by the Natural Science Foundation of Shandong Province (ZR2017BF043).
Keywords reinforcement learning; manipulator; proximal policy optimization algorithm; Actor-Critic algorithm; offline learning

