Cooperative Multi-agent Systems Based on Reinforcement Learning
Original title: 基于强化学习的多Agent协作研究
Cited by: 5
Abstract: Reinforcement learning provides a robust and natural means for agents to learn how to coordinate their action choices in fully cooperative multi-agent systems (MAS). This paper first introduces the basic principles and components of reinforcement learning, then describes the multi-agent Markov decision process (MMDP) and presents a reinforcement learning model for agents in cooperative MAS. On this basis, it compares two forms of reinforcement learning that arise in multi-agent cooperation: independent learners (IL), which ignore the presence of other agents, and joint action learners (JAL), which explicitly attempt to learn the value of joint actions and the strategies of their counterparts. Finally, for the case where multiple optimal policies exist, it examines some simple and commonly used coordination mechanisms.
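The IL/JAL distinction drawn in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 2x2 payoff matrix, the single-state (repeated game) setting, the epsilon-greedy exploration, and the names `PAYOFF`, `run_il`, and `run_jal` are all hypothetical. The game has two optimal joint actions, so the agents must also coordinate on one of them.

```python
import random

# Hypothetical fully cooperative 2x2 game: both agents receive PAYOFF[a0][a1].
# Two optimal joint actions exist, (0, 0) and (1, 1), so coordination matters.
PAYOFF = [[10.0, 0.0],
          [0.0, 10.0]]
N_ACTIONS = 2


def epsilon_greedy(values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def run_il(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Independent learners: each agent keeps Q-values over its own actions only."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(2)]  # q[agent][own_action]
    for _ in range(episodes):
        acts = [epsilon_greedy(q[i], epsilon, rng) for i in range(2)]
        reward = PAYOFF[acts[0]][acts[1]]
        for i in range(2):
            q[i][acts[i]] += alpha * (reward - q[i][acts[i]])
    return q


def run_jal(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Joint action learners: each agent keeps Q-values over joint actions
    and empirical counts of the other agent's past actions."""
    rng = random.Random(seed)
    # q[agent][own_action][other_action]
    q = [[[0.0] * N_ACTIONS for _ in range(N_ACTIONS)] for _ in range(2)]
    # counts[agent][other_action]; starting at 1 avoids division by zero
    counts = [[1] * N_ACTIONS for _ in range(2)]
    for _ in range(episodes):
        acts = []
        for i in range(2):
            total = sum(counts[i])
            # Expected value of each own action under the estimated
            # action frequencies of the other agent.
            ev = [sum(q[i][a][b] * counts[i][b] / total for b in range(N_ACTIONS))
                  for a in range(N_ACTIONS)]
            acts.append(epsilon_greedy(ev, epsilon, rng))
        reward = PAYOFF[acts[0]][acts[1]]
        for i in range(2):
            own, other = acts[i], acts[1 - i]
            q[i][own][other] += alpha * (reward - q[i][own][other])
            counts[i][other] += 1
    return q
```

Calling `run_il()` returns one value per individual action for each agent, while `run_jal()` returns one value per joint action. The joint table is what lets a JAL agent tell the two optimal joint actions apart from the miscoordinated ones, at the cost of having to observe the other agent's choices.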
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2003, No. 11, pp. 1986-1988 (3 pages)
Funding: Supported by the Natural Science Foundation of Anhui Province (00043115)
Keywords: multi-agent system; reinforcement learning; multi-agent MDP (MMDP); coordination mechanisms