
Design of Attitude Controller for Unmanned Helicopter Based on Reinforcement Learning Algorithm (cited by: 1)

Abstract: The adaptive heuristic critic (AHC) reinforcement learning architecture approximates the value function and the policy function of a Markov decision process (MDP) separately, while policy-gradient reinforcement learning can convert a stochastic, uncertain MDP into a deterministic one. By combining AHC reinforcement learning with policy-gradient reinforcement learning, the PID controller parameters are tuned adaptively online, realizing online optimization of the unmanned helicopter's attitude control performance. Simulation results show that, compared with a fixed-parameter PID controller, the proposed algorithm adjusts the controller parameters online and controls the helicopter's hover attitude well.
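For context on the stochastic-to-deterministic conversion mentioned in the abstract, the PEGASUS method cited as reference 5 fixes the simulator's random numbers in advance, so the estimated return becomes a deterministic function of the controller parameters. A minimal sketch of that estimator, with notation assumed here rather than taken from the paper ($\theta$: policy/controller parameters; $\omega_1,\dots,\omega_m$: pre-drawn noise sequences; $R(\tau)$: accumulated reward of rollout $\tau$):

$$\hat{V}(\theta) = \frac{1}{m}\sum_{i=1}^{m} R\big(\tau(\theta;\omega_i)\big)$$

Because the $\omega_i$ are held fixed across evaluations, maximizing $\hat{V}(\theta)$ becomes an ordinary deterministic optimization over $\theta$.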
Source: Journal of Projectiles, Rockets, Missiles and Guidance (《弹箭与制导学报》, CSCD, Peking University Core Journal), 2008, No. 2, pp. 73-76 (4 pages).
Keywords: unmanned helicopter; reinforcement learning; adaptive heuristic critic; policy gradient; PEGASUS
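The abstract above describes tuning PID gains online with an actor-critic (AHC) learner driven by policy-gradient updates, but this record does not reproduce the paper's function approximators, reward, or helicopter model. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: `AttitudePlant` is a toy attitude model, the reward is negative squared tracking error, and the actor/critic pair is a Gaussian policy over the three PID gains with a running reward baseline; all names, dynamics, and learning rates are illustrative.

```python
import numpy as np

# Minimal sketch (NOT the paper's implementation): an actor-critic (AHC)
# style learner with a policy-gradient update that tunes PID gains online
# for a toy attitude loop. Plant, reward, and learning rates are assumed.

rng = np.random.default_rng(0)  # fixed seed: PEGASUS-style repeatable rollouts


class AttitudePlant:
    """Toy damped double-integrator attitude model (illustrative assumption)."""

    def __init__(self, dt=0.02):
        self.dt, self.angle, self.rate = dt, 0.0, 0.0

    def step(self, u):
        # crude damped double integrator as a stand-in for rotor/attitude dynamics
        self.rate += self.dt * (u - 0.5 * self.rate)
        self.angle += self.dt * self.rate
        return self.angle


def run_episode(gains, target=0.1, steps=200):
    """Run one rollout with fixed PID gains; return reward = -tracking cost."""
    kp, ki, kd = gains
    plant = AttitudePlant()
    err_int, err_prev, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        err = target - plant.angle
        err_int += err * plant.dt
        u = kp * err + ki * err_int + kd * (err - err_prev) / plant.dt
        plant.step(u)
        err_prev = err
        cost += err ** 2
    return -cost


# Actor: Gaussian policy over the PID gains; critic: running reward baseline.
gains_mean = np.array([2.0, 0.5, 0.1])   # initial Kp, Ki, Kd (assumed)
sigma = np.array([0.2, 0.05, 0.02])      # exploration std per gain (assumed)
baseline = 0.0
alpha_actor, alpha_critic = 1e-3, 0.05

for episode in range(300):
    noise = rng.standard_normal(3)
    gains = np.maximum(gains_mean + sigma * noise, 0.0)  # sample gains, keep nonnegative
    reward = max(run_episode(gains), -10.0)              # clip to keep updates bounded
    td_error = reward - baseline                         # critic's evaluation signal
    baseline += alpha_critic * td_error                  # critic update (running baseline)
    gains_mean += alpha_actor * td_error * noise / sigma  # REINFORCE-style actor update

print("tuned PID gains (Kp, Ki, Kd):", gains_mean)
```

In the paper itself the critic and actor are AHC function approximators and rollouts are evaluated PEGASUS-style; the toy plant and scalar baseline above merely stand in for those pieces to show the shape of the online tuning loop.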

References (5)

  • 1 Ng A Y. Autonomous helicopter flight via reinforcement learning [C]// Advances in Neural Information Processing Systems 16, 2004.
  • 2 Sutton R S, Barto A G. Reinforcement Learning: An Introduction [M]. Cambridge, MA: MIT Press, 1998.
  • 3 Williams R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning [J]. Machine Learning, 1992, 8(3/4): 229-256.
  • 4 Baxter J, Tridgell A, Weaver L. KnightCap: A chess program that learns by combining TD(λ) with game-tree search [C]// Proceedings of the 15th International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann, 1998.
  • 5 Ng A Y, Jordan M. PEGASUS: A policy search method for large MDPs and POMDPs [C]// Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, 2000.
