Research on Application of Deep Reinforcement Learning TD3 Algorithm in Inverted Pendulum System (cited by: 3)
Abstract: To address the limitations of existing control algorithms for inverted pendulum systems, this paper combines reinforcement learning and deep learning and proposes an end-to-end inverted pendulum control method based on the twin delayed deep deterministic policy gradient (TD3) algorithm. First, a virtual simulation environment is built from the dynamic model of the inverted pendulum, and a sparse reward function is designed. Second, a deep neural network is used to construct an end-to-end control model that maps the pendulum state (input) to the executed action (output); the network structure and parameters are determined by analyzing the characteristics of the inverted pendulum. Finally, the model generated in the virtual simulation environment is transferred to a physical inverted pendulum platform and optimized there. Experimental results show that the resulting model effectively establishes the mapping between the pendulum state and the executed action, and that the approach offers a useful reference for motion control.
Authors: HE Weidong; LIU Xiaochen; ZHANG Yinghui; YAO Shixuan (School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, China; College of Software, Dalian Foreign Language University, Dalian 116044, China)
Source: Journal of Dalian Jiaotong University, CAS, 2023, No. 1, pp. 38-44 (7 pages in total)
Keywords: deep reinforcement learning; inverted pendulum control; TD3; end-to-end; sparse reward function
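
The abstract names two ingredients that can be illustrated briefly: a sparse reward function and the TD3 critic target. The sketch below is a minimal illustration, not the authors' implementation. The reward thresholds (`theta_tol`, `x_limit`), the hyperparameter defaults, and all function names are assumptions, since this record gives neither the paper's reward definition nor its network details. The target follows the standard TD3 form: clipped noise is added to the target policy's action, and the smaller of two target critic values is used.

```python
import random


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def sparse_reward(theta, x, theta_tol=0.2, x_limit=2.4):
    """Hypothetical sparse reward: 1 while the pole is near upright and the
    cart is on the track, otherwise 0. Thresholds are illustrative only."""
    upright = abs(theta) <= theta_tol
    on_track = abs(x) <= x_limit
    return 1.0 if (upright and on_track) else 0.0


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Standard TD3 critic target for one transition with a scalar action.

    actor_target(state) -> action and critic*_target(state, action) -> value
    are placeholder callables standing in for the target networks.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -max_action, max_action)
    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * q_min
```

In a full TD3 training loop, both critics would be regressed toward this target at every step, while the actor and the target networks would be updated less often (the "delayed" part of the name).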