
Fuzzy Q-learning in continuous state and action space

Abstract: An adaptive fuzzy Q-learning (AFQL) method based on fuzzy inference systems (FIS) is proposed. The FIS, realized by a normalized radial basis function (NRBF) neural network, is used to approximate the Q-value function, whose input is composed of the state and the action. The rules of the FIS are created incrementally according to the novelty of each state-action pair. Moreover, the premise and consequent parts of the FIS are updated using an extended Kalman filter (EKF). The action applied to the environment is the one that maximizes the FIS output in the current state, and it is obtained through an optimization method. Simulation results on the wall-following task of mobile robots and the inverted pendulum balancing problem demonstrate the superiority and applicability of the proposed AFQL method.
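The following is a minimal Python sketch of the idea described in the abstract, not the authors' algorithm: a normalized RBF network approximates Q over the joint (state, action) input, new rules are added when a state-action pair is sufficiently novel, and the greedy action is found by scanning the network output over candidate actions. The class name NRBFQ, all parameter values, and the plain gradient update (standing in for the paper's EKF update of the premise and consequent parameters) are illustrative assumptions.

```python
import numpy as np

class NRBFQ:
    """Normalized RBF approximator of Q(s, a) over the joint state-action input."""

    def __init__(self, input_dim, width=0.5, novelty_threshold=0.3, lr=0.1):
        self.centers = np.empty((0, input_dim))  # rule centers over (state, action)
        self.weights = np.empty(0)               # consequent weight of each rule
        self.width = width                       # shared Gaussian width
        self.novelty_threshold = novelty_threshold
        self.lr = lr

    def _phi(self, x):
        # Normalized Gaussian activations (the NRBF layer).
        d2 = np.sum((self.centers - x) ** 2, axis=1)
        g = np.exp(-d2 / (2.0 * self.width ** 2))
        return g / (np.sum(g) + 1e-12)

    def value(self, state, action):
        if self.weights.size == 0:
            return 0.0
        x = np.concatenate([state, np.atleast_1d(action)])
        return float(self._phi(x) @ self.weights)

    def maybe_add_rule(self, state, action, init_weight=0.0):
        # Grow the rule base when the (state, action) pair is far from every center.
        x = np.concatenate([state, np.atleast_1d(action)])
        if (self.weights.size == 0 or
                np.min(np.linalg.norm(self.centers - x, axis=1)) > self.novelty_threshold):
            self.centers = np.vstack([self.centers, x])
            self.weights = np.append(self.weights, init_weight)

    def update(self, state, action, td_target):
        # Move the prediction toward the TD target; the paper updates both premise and
        # consequent parameters with an EKF, while this sketch uses a plain gradient step.
        x = np.concatenate([state, np.atleast_1d(action)])
        phi = self._phi(x)
        self.weights += self.lr * (td_target - float(phi @ self.weights)) * phi

    def greedy_action(self, state, candidate_actions):
        # Approximate argmax_a Q(s, a) by scanning a grid of candidate actions.
        values = [self.value(state, a) for a in candidate_actions]
        return candidate_actions[int(np.argmax(values))]
```

A single Q-learning step with this sketch could look as follows; the environment quantities (reward, next_state, gamma) are placeholders:

```python
q = NRBFQ(input_dim=3)                      # e.g. a 2-D state plus a scalar action
actions = np.linspace(-1.0, 1.0, 21)
state = np.array([0.2, -0.1])
a = q.greedy_action(state, actions)
# ... apply a to the environment, observe reward and next_state ...
reward, next_state, gamma = 1.0, np.array([0.25, -0.05]), 0.95
td_target = reward + gamma * max(q.value(next_state, b) for b in actions)
q.maybe_add_rule(state, a, td_target)
q.update(state, a, td_target)
```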
Source: The Journal of China Universities of Posts and Telecommunications (中国邮电高校学报(英文版)), EI, CSCD, 2010, Issue 4: 100-109 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (60703106)
Keywords: Q-learning, FIS, continuous, adaptation

