Journal Article

Multiagent Q-learning based on ant colony algorithm and roulette algorithm (cited by: 5)
Abstract: A novel multiagent reinforcement learning algorithm based on Q-learning, the ant colony algorithm, and roulette-wheel selection is proposed. In reinforcement learning, when the number of agents grows large enough, the action space suffers a combinatorial explosion and the learning speed drops sharply. Moreover, because an agent uses Q-values to choose its next action, action selection in the early stage of learning is heavily biased toward actions with high Q-values. The ant colony algorithm and roulette-wheel selection are therefore combined with Q-learning, with the expectation of resolving both problems. Theoretical analysis and experimental results both demonstrate that the improved Q-learning algorithm is feasible and effectively improves learning efficiency.
Source: Computer Engineering and Applications (《计算机工程与应用》, CSCD, Peking University core journal), 2009, No. 16, pp. 60-62 (3 pages).
Funding: Jilin Province Science and Technology Development Program (No.20070530).
Keywords: multiagent reinforcement learning algorithm; ant colony algorithm; roulette algorithm
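The abstract describes weighting action choice by ant-colony pheromone as well as by Q-values, so that early learning is not locked onto high-Q actions. A minimal sketch of such a scheme follows; the function names, the exact weighting form, and all parameters are illustrative assumptions, not the paper's actual formulation:

```python
import random

def roulette_select(q_values, pheromone, alpha=1.0, beta=1.0, eps=1e-6):
    """Roulette-wheel choice: action i is drawn with probability
    proportional to (shifted Q_i)**alpha * pheromone_i**beta, so
    low-Q actions keep a nonzero chance early in learning."""
    shift = min(q_values)  # make all Q-derived weights non-negative
    weights = [((q - shift + eps) ** alpha) * (tau ** beta)
               for q, tau in zip(q_values, pheromone)]
    total = sum(weights)
    r = random.uniform(0.0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1  # guard against floating-point rounding

def q_update(q_table, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """Standard one-step Q-learning backup."""
    best_next = max(q_table[next_state])
    q_table[state][action] += lr * (reward + gamma * best_next
                                    - q_table[state][action])
```

In this sketch, pheromone acts as a second selection signal alongside the Q-value, which is one plausible reading of how the paper mitigates the early-stage bias toward high-Q actions.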

