
Inverted Pendulum Control Using a Neural-Network-Based Reinforcement Learning Algorithm (cited by: 7)

Balance of an Inverted Pendulum Using Neural Network and Q-Learning
Abstract: Balancing a continuous inverted pendulum system with reinforcement learning has long been an open problem. This paper combines Q-learning with a BP (backpropagation) network and the sigmoid activation function, and uses the generalization ability of neural networks to design a new learning control strategy. Through iteration and learning, the strategy handles not only the continuous state space at the input of the inverted pendulum system but also a continuous action space at the output. The method is applied to the balance control of a continuous inverted pendulum, and Matlab simulation experiments based on a real control model show that it is feasible. The method further improves the practical value of reinforcement learning theory in real control systems.
Authors: Zhang Tao (张涛), Wu Hansheng (吴汉生)
Source: Computer Simulation (《计算机仿真》), CSCD, 2006, No. 4, pp. 298-300, 325 (4 pages)
Keywords: reinforcement learning; neural network; activation function; generalization performance; continuous action space
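
The abstract describes Q-learning combined with a BP network and sigmoid units, but it does not give the network layout, the reward, or how the continuous action is selected. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation: a one-hidden-layer network approximates Q(s, a) from a 4-dimensional state (assumed here to be cart position, cart velocity, pole angle, and angular velocity) plus a scaled force, and the greedy force is approximated by a grid search over a hypothetical range [-f_max, f_max]. The names QNet, greedy_action, and q_learning_step are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class QNet:
    """One-hidden-layer network approximating Q(s, a); sigmoid hidden units, linear output."""

    def __init__(self, n_in, n_hidden, lr=0.1, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.lr = lr

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        q = sum(w * v for w, v in zip(self.w2, hb))
        return q, xb, hb

    def update(self, x, target):
        """One backpropagation step on the squared error (target - Q)^2 / 2."""
        q, xb, hb = self.forward(x)
        err = target - q
        # Hidden-layer deltas are computed before the output weights change.
        deltas = [err * self.w2[j] * hb[j] * (1.0 - hb[j])
                  for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.w2[j] += self.lr * err * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] += self.lr * deltas[j] * xb[i]
        return err


def greedy_action(net, state, f_max=10.0, n_candidates=21):
    """Approximate the argmax over a continuous force range by a grid search (assumption)."""
    best_a, best_q = 0.0, -math.inf
    for k in range(n_candidates):
        a = -f_max + 2.0 * f_max * k / (n_candidates - 1)
        q, _, _ = net.forward(list(state) + [a / f_max])
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q


def q_learning_step(net, s, a, r, s_next, done, gamma=0.95, f_max=10.0):
    """One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    if done:
        target = r
    else:
        _, q_next = greedy_action(net, s_next, f_max)
        target = r + gamma * q_next
    return net.update(list(s) + [a / f_max], target)
```

A possible usage, with made-up numbers: create `net = QNet(n_in=5, n_hidden=6)`, pick a force with `greedy_action(net, state)` (adding exploration noise as needed), apply it in a pendulum simulator, and pass the observed transition to `q_learning_step`. The grid search is a simple stand-in for continuous action selection; the paper's own mechanism may differ.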
