Deep reinforcement learning method for biped robot gait control (cited by: 8)
Abstract: To address stable gait control during biped robot walking, a deep reinforcement learning method based on an improved Deep Q-Network (DQN) is proposed. First, the DQN algorithm is combined with the deterministic policy gradient, and a clipped Double-Q network is used as the critic of the actor-critic network, yielding an improved DQN. Next, a link model of the biped robot is built, and the improved DQN is used to train the biped robot, acting as the agent, for gait control on ordinary flat ground. MATLAB simulation results show that, compared with the DQN and Deep Deterministic Policy Gradient (DDPG) algorithms, the proposed algorithm trains faster and its average reward curve is smoother. With CPU-only training, agent training completes after about 20 hours of deep reinforcement learning. The biped robot achieves stable, fast walking (average speed about 0.5 m/s) with small torques over a long distance (about 5 m).
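The abstract names a clipped Double-Q critic but does not give its update rule. As an illustration only, the sketch below assumes the commonly used clipped double-Q target, y = r + γ(1 − d)·min(Q1', Q2'), in which two critics share one target built from the smaller of their next-state estimates. The function names, arguments, and the discount value are hypothetical and are not taken from the paper.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD target built from the smaller of two critic estimates (assumed form)."""
    # Taking the minimum counteracts the overestimation bias of a single critic.
    min_q_next = min(q1_next, q2_next)
    # Do not bootstrap from a terminal state.
    bootstrap = 0.0 if done else gamma * min_q_next
    return reward + bootstrap


def critic_losses(q1, q2, target):
    """Squared TD errors of both critics against the shared target."""
    return (q1 - target) ** 2, (q2 - target) ** 2


if __name__ == "__main__":
    y = clipped_double_q_target(reward=1.0, done=False,
                                q1_next=2.0, q2_next=3.0, gamma=0.5)
    print(y)                           # 2.0
    print(critic_losses(1.5, 2.5, y))  # (0.25, 0.25)
```

In a full actor-critic setup the next-state estimates would come from two target critics evaluated at the target actor's action; scalars are used here so that the arithmetic is easy to check by hand.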
Authors: FENG Chun; ZHANG Yiwei; HUANG Cheng; JIANG Wenbiao; WU Zhiwei (School of Aerospace and Mechanical Engineering, Changzhou Institute of Technology, Changzhou 213032, China)
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》; indexed in EI, CSCD, and the Peking University Core Journals list), 2021, No. 8, pp. 2341-2349 (9 pages)
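Funding: Youth Program of the National Natural Science Foundation of China (11802040); 2018 Jiangsu Province Qinglan Project for Outstanding Young Backbone Teachers (A1-5501-19-003).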
Keywords: biped robot; gait control; deep reinforcement learning; agent; actor-critic; improved deep Q-network algorithm