
Multi-agent Cooperative Reinforcement Learning Algorithm SE-MACOL and Its Application (cited by: 5)
Abstract: This paper addresses learning in multi-agent teams whose members cooperate with one another yet make their decisions autonomously. It extends the Q-learning algorithm to this setting and proposes a multi-agent cooperative reinforcement learning algorithm based on shared experience tuples. The algorithm introduces a new knowledge representation for state-action pairs, expressed as ordered pairs of the form (state-value, action-value), and shares experience tuples among teammates through a similarity transformation defined over homogeneous subtasks. This reduces the state-action space and improves learning efficiency, which in turn improves the cooperative efficiency of the team. Finally, the algorithm is applied to the pursuit (hunter-prey) game domain, where the experimental results show that it markedly speeds up the cooperative capture of prey by multiple hunters.
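The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of sharing experience tuples in tabular Q-learning, not the SE-MACOL procedure itself. The class `QAgent`, the function `share_experience`, the action set, the relative-position state encoding, and the values of `alpha` and `gamma` are all illustrative assumptions. The `transform` argument is a placeholder for the paper's similarity transformation and defaults to the identity.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")  # hypothetical action set


class QAgent:
    """Tabular Q-learner; the hyperparameters are illustrative assumptions."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.q = defaultdict(float)  # maps (state, action) -> estimated value

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update on one experience tuple."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def share_experience(team, experience, transform=lambda agent_index, exp: exp):
    """Apply one experience tuple to every teammate's Q-table.

    `transform` stands in for the paper's similarity transformation, whose
    definition is not given in the abstract; the default is the identity.
    """
    for i, agent in enumerate(team):
        agent.update(*transform(i, experience))


team = [QAgent() for _ in range(3)]
# Hypothetical encoding: state = prey position relative to the hunter.
# One hunter sees the prey one cell to the right, moves right, and captures it.
share_experience(team, ((1, 0), "right", 1.0, (0, 0)))
print([agent.q[((1, 0), "right")] for agent in team])  # [0.5, 0.5, 0.5]
```

In this sketch a single capture experienced by one hunter updates all three Q-tables, which is the intuition behind sharing experience to speed up team learning. How the paper actually transforms and filters tuples between agents would need to be taken from the full text.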
Source: Journal of Guangxi Normal University: Natural Science Edition, 2006, No. 4, pp. 167-170 (4 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (grant 70371008)
Keywords: multi-agent learning; reinforcement learning; Q-learning; state-action space; cooperative team

References (5)

  • 1 WANG Jue, SHI Chunyi. Research on machine learning[J]. Journal of Guangxi Normal University (Natural Science Edition), 2003, 21(2): 1-15. Cited by: 77
  • 2 KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: A survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
  • 3 RIBEIRO C. Reinforcement learning agents[C]//Artificial Intelligence Review. Netherlands: Kluwer Academic Publishers, 2002(17): 223-250.
  • 4 CAI Qingsheng, ZHANG Bo. An agent-team-based reinforcement learning model and its application[J]. Journal of Computer Research and Development, 2000, 37(9): 1087-1093. Cited by: 31
  • 5 TAN Ming. Multi-agent reinforcement learning: Independent vs. cooperative agents[C]//Proceedings of the Tenth International Conference on Machine Learning. USA: Morgan Kaufmann Publishers, 1993: 330-337.

