
Control of the Double Inverted Pendulum Based on Reinforcement Learning (基于强化学习的二级倒立摆控制)
Cited by: 3
Abstract: An improved reinforcement learning algorithm is used to balance a double inverted pendulum when the system model is unknown and no prior control experience is available. The algorithm requires neither a prediction model nor model identification; it searches for the optimal policy online through the network's own association and memory. It approximates the value function with a neural network and updates the weights using the direct gradient together with eligibility traces, which lets it handle control tasks with continuous state and action spaces. Computer simulations show that the algorithm learns to control a linear double inverted pendulum system within a short time.
Source: Computer Simulation (《计算机仿真》), CSCD, 2006, No. 4, pp. 305-308 (4 pages)
Funding: National Natural Science Foundation of China (60375017)
Keywords: Reinforcement learning; Inverted pendulum; Eligibility traces
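
The abstract describes value-function approximation with a neural network whose weights are corrected by the direct gradient and eligibility traces. The record does not give the network structure, learning rates, or action-selection rule, so the following is only a minimal sketch of a generic semi-gradient TD(λ) critic with accumulating traces, not the authors' implementation. The class name `TinyCritic`, the single tanh hidden layer, and every hyperparameter value are assumptions made for illustration.

```python
import math
import random


class TinyCritic:
    """Hypothetical one-hidden-layer value network V(s) trained with TD(lambda)."""

    def __init__(self, n_in, n_hidden, alpha=0.05, gamma=0.95, lam=0.8, seed=0):
        rng = random.Random(seed)
        # alpha, gamma, lam are illustrative values, not taken from the paper.
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.reset_traces()

    def reset_traces(self):
        """Clear the eligibility traces; call at the start of each episode."""
        self.e1 = [[0.0] * len(row) for row in self.w1]
        self.e2 = [0.0] * len(self.w2)

    def _hidden(self, s):
        return [math.tanh(sum(w * x for w, x in zip(row, s))) for row in self.w1]

    def value(self, s):
        return sum(w * h for w, h in zip(self.w2, self._hidden(s)))

    def update(self, s, reward, s_next, done):
        """Apply one TD(lambda) step and return the TD error."""
        h = self._hidden(s)
        v = sum(w * hj for w, hj in zip(self.w2, h))
        v_next = 0.0 if done else self.value(s_next)
        delta = reward + self.gamma * v_next - v

        # Accumulating traces: e <- gamma * lambda * e + dV(s)/dw
        decay = self.gamma * self.lam
        for j, hj in enumerate(h):
            dpre = self.w2[j] * (1.0 - hj * hj)  # dV / d(pre-activation of unit j)
            self.e2[j] = decay * self.e2[j] + hj
            for i, x in enumerate(s):
                self.e1[j][i] = decay * self.e1[j][i] + dpre * x

        # Gradient-style correction: w <- w + alpha * delta * e
        for j in range(len(self.w2)):
            self.w2[j] += self.alpha * delta * self.e2[j]
            for i in range(len(s)):
                self.w1[j][i] += self.alpha * delta * self.e1[j][i]
        return delta
```

In a pendulum simulation, `s` would typically hold the cart position, the two link angles, and their velocities, and `update` would be called once per control step with a failure signal as `reward`. That wiring is also an assumption, since the abstract does not describe the state encoding or the reinforcement signal.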

