Fuzzy Neural Network Controller of Inverted Pendulum with Reinforcement Learning Rule

Cited by: 1

Abstract: To stabilize a single inverted pendulum on a cart whose model is unknown and which is disturbed by an impulse input at the start, a fuzzy neural network controller trained with a reinforcement learning rule is proposed. A fuzzy controller based on T-S (Takagi-Sugeno) rules is constructed as a neural network. A predictor built from a three-layer feedforward network is run in simulation to obtain the pendulum states and compute state predictions; the states and state predictions are then paired as training data for the state-prediction BP (back-propagation) network. The fuzzy controller is trained by reinforcement learning, and the control decision is made from the fuzzy control output produced by the neural network together with the predicted pendulum state. The method simplifies parameter tuning of the fuzzy control part and can also be applied to other model-free control problems. Experiments show that the controller is robust: it keeps the pendulum balanced even when the pendulum parameters change substantially.
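The T-S inference step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the membership functions, the number of rules, or the consequent form, so this example assumes Gaussian membership functions, product firing strengths, and first-order (linear) consequents. The function names and all numeric values are hypothetical and are not taken from the paper; the network training and reinforcement learning steps are not shown.

```python
import math
from itertools import product


def gaussian(x, center, width):
    """Gaussian membership degree of x in a fuzzy set (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def ts_control(state, centers, widths, consequents):
    """First-order Takagi-Sugeno inference.

    state:       list of n inputs (e.g. pendulum angle, angular rate)
    centers:     per input, list of membership-function centers
    widths:      per input, list of membership-function widths
    consequents: dict mapping a rule index tuple to n + 1 coefficients
                 (n input gains followed by a bias term)
    """
    num = 0.0
    den = 0.0
    # One rule per combination of fuzzy sets, one set chosen per input.
    for rule in product(*(range(len(c)) for c in centers)):
        # Firing strength: product of the antecedent membership degrees.
        w = 1.0
        for i, k in enumerate(rule):
            w *= gaussian(state[i], centers[i][k], widths[i][k])
        # Rule output: linear function of the inputs plus the bias.
        coeffs = consequents[rule]
        y = coeffs[-1] + sum(c * x for c, x in zip(coeffs, state))
        num += w * y
        den += w
    # Weighted average of rule outputs; guard against underflow to zero.
    return num / den if den > 0.0 else 0.0


# Hypothetical example: two fuzzy sets ("negative", "positive") per input.
CENTERS = [[-0.5, 0.5], [-1.0, 1.0]]
WIDTHS = [[0.4, 0.4], [0.8, 0.8]]
# Every rule shares one illustrative linear law: u = 10 * angle + 2 * rate.
CONSEQUENTS = {rule: (10.0, 2.0, 0.0) for rule in product(range(2), repeat=2)}
force = ts_control([0.1, -0.05], CENTERS, WIDTHS, CONSEQUENTS)
```

Because every rule in this example shares the same consequent, the weighted average reduces to that single linear law. A trained controller would hold distinct coefficients per rule, and those coefficients are what a learning rule would adjust.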

Authors: Wang Hongrui (王红睿), Zhao Liming (赵黎明)

Affiliation: College of Communication Engineering, Jilin University

Source: Journal of Jilin University (Information Science Edition) [CAS], 2006, No. 5, pp. 561-566 (6 pages)

Keywords: reinforcement learning; fuzzy neural network controller; Takagi-Sugeno (T-S) model; inverted pendulum

Classification: TP273.4 [Automation and Computer Technology: Detection Technology and Automatic Devices]


