An Improved Fuzzy Sarsa Learning
Abstract: Fuzzy Sarsa learning (FSL) cannot tune its learning rate online and does not handle the balance between exploration and exploitation during learning. To address both problems, an improved fuzzy Sarsa learning (IFSL) algorithm is proposed. Building on FSL, the method introduces an adaptive learning rate generator that tunes the learning rate online and a fuzzy balancer that controls the degree of exploration versus exploitation. The structure diagram of IFSL is given, and it is proved that the adjustable weight vector of IFSL has an equilibrium fixed point: under a stationary action selection policy, the weight vector converges to a unique value. Simulation results show that, compared with FSL, IFSL manages the exploration-exploitation balance well, converges faster during learning, and outperforms FSL in terms of learning speed and action quality.
Source: Journal of Beijing University of Posts and Telecommunications (indexed in EI, CAS, CSCD, Peking University Core Journals), 2011, No. 2, pp. 31-34, 44 (5 pages)
Funding: National Natural Science Foundation of China (60974019); Natural Science Foundation of Guangdong Province (9451009001002686)
Keywords: reinforcement learning; fuzzy control; fuzzy Sarsa learning; exploration; exploitation
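
To make the two tuning knobs named in the abstract concrete, the following is a minimal sketch of a generic Sarsa update over fuzzy rule firing strengths. It is not the authors' IFSL algorithm: the abstract does not specify the adaptive learning rate generator or the fuzzy balancer, so both are stood in for by plain parameters (`alpha` for the learning rate, `temperature` for the exploration level). The triangular membership functions, the softmax action selection, and all function names are assumptions made for illustration only.

```python
import math
import random


def firing_strengths(x, centers, width):
    """Normalized triangular memberships of a scalar state x, one per fuzzy rule.

    Assumption: one rule per center, all with the same half-width.
    """
    raw = [max(0.0, 1.0 - abs(x - c) / width) for c in centers]
    total = sum(raw)
    if total == 0.0:
        # No rule fires: fall back to uniform strengths.
        return [1.0 / len(centers)] * len(centers)
    return [r / total for r in raw]


def q_value(weights, phi, action):
    """Q(s, a) as the firing-strength-weighted sum of per-rule action weights."""
    return sum(w_rule[action] * p for w_rule, p in zip(weights, phi))


def softmax_action(weights, phi, n_actions, temperature, rng):
    """Sample an action; a higher temperature means more exploration.

    In IFSL a fuzzy balancer would set this trade-off; here it is a fixed
    parameter (assumption).
    """
    qs = [q_value(weights, phi, a) for a in range(n_actions)]
    q_max = max(qs)  # subtract the maximum for numerical stability
    prefs = [math.exp((q - q_max) / temperature) for q in qs]
    threshold = rng.random() * sum(prefs)
    acc = 0.0
    for a, p in enumerate(prefs):
        acc += p
        if threshold <= acc:
            return a
    return n_actions - 1


def sarsa_update(weights, phi, action, reward, phi_next, action_next, alpha, gamma):
    """One on-policy temporal-difference update of the weights; returns the TD error.

    In IFSL an adaptive generator would supply alpha online; here it is
    passed in by the caller (assumption).
    """
    delta = (reward
             + gamma * q_value(weights, phi_next, action_next)
             - q_value(weights, phi, action))
    for i, p in enumerate(phi):
        weights[i][action] += alpha * delta * p
    return delta
```

A caller would compute `phi = firing_strengths(x, centers, width)` for the current state, pick an action with `softmax_action`, observe the reward and next state, pick the next action the same way, and then call `sarsa_update`. Replacing the fixed `alpha` and `temperature` with values adjusted during learning is where the two components described in the abstract would plug in.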

