
Reinforcement Learning Based on Many Parallel CMAC Neural Networks (基于多个并行CMAC神经网络的强化学习方法)

Cited by: 2
Abstract: To address the slow convergence of the standard Q-learning algorithm, a reinforcement learning method based on multiple parallel Cerebellar Model Articulation Controller (CMAC) neural networks is proposed. The input state variables are partitioned so that the number of quantization levels per variable is reduced without lowering the state resolution, which effectively shrinks the storage space each CMAC requires. The parallel CMACs are combined with Q-learning, and their outputs are used to approximate the Q-values of the corresponding state variables, improving both the learning speed and the control precision of Q-learning and providing generalization over continuous states. The method is applied to the balance control of a linear inverted pendulum, and simulation results confirm its correctness and effectiveness.
Source: Journal of System Simulation (《系统仿真学报》, indexed in EI, CAS, CSCD, Peking University Core), 2008, No. 24, pp. 6683-6685, 6690 (4 pages).
Funding: National Natural Science Foundation of China (60674066, 3067054); doctoral start-up fund (科博启动基金, 52002011200702).
Keywords: reinforcement learning; CMAC; neural network; convergence; inverted pendulum
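By way of illustration only, the Python sketch below shows one way the idea summarized in the abstract can be realized: several small CMAC (tile-coding) approximators, each covering only a subset ("slice") of the state variables, are combined to estimate action values and trained with a standard Q-learning temporal-difference update. The class and function names, tiling sizes, learning rate, and state partitioning are assumptions made here for clarity; they are not taken from the paper.

```python
import numpy as np

class CMAC:
    """One CMAC (tile coder) over a low-dimensional slice of the state."""
    def __init__(self, n_tilings, bins_per_dim, low, high, n_actions, lr=0.1):
        self.n_tilings = n_tilings
        self.bins = np.asarray(bins_per_dim)
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.n_actions = n_actions
        self.lr = lr
        # one weight table per tiling, indexed by tile coordinates and action
        self.w = [np.zeros(tuple(self.bins) + (n_actions,))
                  for _ in range(n_tilings)]

    def _tiles(self, s):
        """Active tile coordinates of state slice s in every tiling."""
        ratio = (np.asarray(s, dtype=float) - self.low) / (self.high - self.low)
        coords = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings                  # shift each tiling slightly
            idx = np.floor(ratio * (self.bins - 1) + offset).astype(int)
            coords.append(tuple(np.clip(idx, 0, self.bins - 1)))
        return coords

    def q(self, s):
        """Q-value vector for this slice: average over the active tiles."""
        return sum(self.w[t][c] for t, c in enumerate(self._tiles(s))) / self.n_tilings

    def update(self, s, a, td_error):
        """Spread a TD error over the tiles that were active for (s, a)."""
        for t, c in enumerate(self._tiles(s)):
            self.w[t][c + (a,)] += self.lr * td_error / self.n_tilings


def q_values(cmacs, slices):
    """Parallel combination: sum the per-slice estimates of all CMACs."""
    return sum(c.q(s) for c, s in zip(cmacs, slices))


def q_learning_step(cmacs, slices, a, r, next_slices, gamma=0.99):
    """One Q-learning update distributed over the parallel CMACs."""
    td_target = r + gamma * np.max(q_values(cmacs, next_slices))
    td_error = td_target - q_values(cmacs, slices)[a]
    for c, s in zip(cmacs, slices):
        c.update(s, a, td_error)
```

For a linear inverted pendulum, the state (x, x_dot, theta, theta_dot) could, for example, be split into two slices (x, x_dot) and (theta, theta_dot), each handled by its own CMAC; keeping each network low-dimensional is what reduces the per-network quantization levels and storage, in the spirit of the partitioning described in the abstract.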

References (6)

1. Barto A G, Sutton R S, Anderson C W. Neuronlike adaptive elements that can solve difficult learning control problems [J]. IEEE Transactions on Systems, Man, and Cybernetics, 1983, 13(5): 834-846.
2. Anderson C W. Learning to control an inverted pendulum using neural networks [J]. IEEE Control Systems Magazine, 1989, 9(4): 31-35.
3. Watkins C J C H. Learning from Delayed Rewards [D]. Cambridge, UK: University of Cambridge, 1989.
4. Jiang Guofei, Wu Cangpu. Inverted pendulum control based on the Q-learning algorithm and BP neural networks [J]. Acta Automatica Sinica (自动化学报), 1998, 24(5): 662-666. (Cited by: 55)
5. Si J, Wang Y T. On-line learning control by association and reinforcement [J]. IEEE Transactions on Neural Networks, 2001, 12(2): 264-276.
6. Lin C T, Lee C S G. Reinforcement structure/parameter learning for neural-network-based fuzzy logic control systems [J]. IEEE Transactions on Fuzzy Systems, 1994, 2(1): 46-63.


