
Parallel Reinforcement Learning Algorithm and Its Application (并行强化学习算法及其应用研究)

Cited by: 7
Abstract: Reinforcement learning is an important machine learning method, but slow convergence is one of its main shortcomings in practice. To improve the efficiency of reinforcement learning, this paper proposes a parallel reinforcement learning algorithm. The learning system contains multiple agents, each of which learns independently during a learning period. At the end of each period, the agents' results are fused using Dempster-Shafer (D-S) evidence theory, and every agent starts the next period from the fused result, which raises the learning efficiency of the system as a whole. Experimental results show the feasibility and effectiveness of the method.
Authors: 孟伟 (Meng Wei), 韩学东 (Han Xuedong)
Source: Computer Engineering and Applications (《计算机工程与应用》, CSCD, Peking University core journal), 2009, No. 34, pp. 25-28, 52 (5 pages)
Funding: National "Eleventh Five-Year" Science and Technology Support Program major project, No. 2006BAD03A02
Keywords: parallel algorithms; reinforcement learning; Q-learning; D-S evidence theory; path planning
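The abstract describes the scheme only at a high level: several Q-learning agents learn in parallel for a period, their results are fused with D-S evidence theory, and the fused result seeds the next period. The sketch below is an illustrative reconstruction, not the paper's implementation: it assumes tabular Q-learning on a small gridworld path-planning task, turns each agent's per-state Q-values into a basic probability assignment over singleton action hypotheses, and combines them with Dempster's rule. All function names, parameters, and the fusion-as-new-Q-table choice are assumptions made for the example.

```python
import random

random.seed(0)

N = 5                                            # 5x5 gridworld, start (0,0)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]     # right, left, down, up
GOAL = (N - 1, N - 1)

def step(state, a):
    """One environment step: move, clip to the grid, reward 1 at the goal."""
    x, y = state
    dx, dy = ACTIONS[a]
    s2 = (min(max(x + dx, 0), N - 1), min(max(y + dy, 0), N - 1))
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

def q_learning_period(Q, episodes=20, alpha=0.5, gamma=0.9, eps=0.2):
    """One learning period of epsilon-greedy tabular Q-learning."""
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda i: Q[s][i]))
            s2, r, done = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if done:
                break

def masses(qrow):
    """Turn one state's Q-values into a basic probability assignment
    over the four singleton action hypotheses (shift to positive, normalize)."""
    lo = min(qrow)
    shifted = [q - lo + 1e-6 for q in qrow]
    z = sum(shifted)
    return [v / z for v in shifted]

def dempster_combine(m1, m2):
    """Dempster's rule restricted to singleton hypotheses: pointwise
    product, renormalized by 1 minus the conflict mass."""
    prod = [a * b for a, b in zip(m1, m2)]
    z = sum(prod)                 # = 1 - conflict
    return [p / z for p in prod] if z > 0 else m1

def fuse(qtables):
    """Fuse all agents' Q-tables state by state into one shared belief."""
    fused = {}
    for s in qtables[0]:
        m = masses(qtables[0][s])
        for Q in qtables[1:]:
            m = dempster_combine(m, masses(Q[s]))
        fused[s] = m
    return fused

# Three agents learn in parallel periods; after each period the fused
# belief is written back as every agent's table for the next period.
agents = [{(x, y): [0.0] * 4 for x in range(N) for y in range(N)}
          for _ in range(3)]
for period in range(5):
    for Q in agents:
        q_learning_period(Q)
    shared = fuse(agents)
    for Q in agents:
        for s in Q:
            Q[s] = list(shared[s])

# Greedy rollout on the fused result.
s, path = (0, 0), [(0, 0)]
for _ in range(50):
    a = max(range(4), key=lambda i: agents[0][s][i])
    s, _, done = step(s, a)
    path.append(s)
    if done:
        break
print("reached goal:", s == GOAL, "rollout length:", len(path) - 1)
```

Because the masses are an order-preserving transform of each Q-row, agents that agree on the best action reinforce it under Dempster combination, while disagreement spreads the fused mass, which is one plausible reading of how fusion lets agents share a common result between periods.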
