Multi-agent cooperation based on reinforcement learning (基于强化学习的多智能体协作实现)

Cited by: 2
Abstract: Reinforcement learning based on the Markov decision process is an online learning method that works well in single-agent environments. In a multi-agent system, however, the Markov assumption underlying the theory no longer holds, so reinforcement learning cannot be applied directly to multi-agent cooperative learning. This paper proposes a two-layer reinforcement learning method for multi-agent cooperation, realized by building two layers of reinforcement learning units into each agent. The first layer learns the agents' joint task cooperation strategy; the second layer learns the action policy that is most effective from the agent's own point of view. The method was applied to a computer simulation in which three agents cooperatively lift a circular object, and the results show that agents using it cooperate better than agents using traditional reinforcement learning.
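Source: Journal of Zhejiang University of Technology (《浙江工业大学学报》), CAS, 2004, No. 5, pp. 516-519, 572 (5 pages)
Funding: Natural Science Foundation of Zhejiang Province (Project No. 601078)
Keywords: reinforcement learning; multi-agent system; cooperation strategy; Markov process; unit; online learning; model; cooperative learning; object; Q-learning; multi-agent cooperation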
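
The two-layer structure described in the abstract could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the update rule, the state encoding, or the reward signals, so the code assumes tabular Q-learning (suggested by the keyword list) with epsilon-greedy selection, a shared team reward for the first layer, and an individual reward for the second. The names `QTable`, `TwoLayerAgent`, `team_reward`, and `own_reward` are hypothetical.

```python
import random
from collections import defaultdict


class QTable:
    """One tabular Q-learning unit with epsilon-greedy selection (assumed)."""

    def __init__(self, choices, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.choices = list(choices)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random(0)
        self.q = defaultdict(float)  # (state, choice) -> estimated value

    def best(self, state):
        # Greedy choice; ties resolve to the first listed choice.
        return max(self.choices, key=lambda c: self.q[(state, c)])

    def select(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.choices)
        return self.best(state)

    def update(self, state, choice, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, c)] for c in self.choices)
        target = reward + self.gamma * best_next
        self.q[(state, choice)] += self.alpha * (target - self.q[(state, choice)])


class TwoLayerAgent:
    """Hypothetical agent holding two learning units, one per layer."""

    def __init__(self, strategies, actions, **kw):
        self.layer1 = QTable(strategies, **kw)  # joint cooperation strategy
        self.layer2 = QTable(actions, **kw)     # agent's own action policy

    def act(self, state):
        strategy = self.layer1.select(state)
        # Layer 2 conditions on the strategy chosen by layer 1 (assumption).
        action = self.layer2.select((state, strategy))
        return strategy, action

    def learn(self, state, strategy, action, team_reward, own_reward, next_state):
        self.layer1.update(state, strategy, team_reward, next_state)
        next_strategy = self.layer1.best(next_state)
        self.layer2.update((state, strategy), action, own_reward,
                           (next_state, next_strategy))


if __name__ == "__main__":
    agent = TwoLayerAgent(["lift", "hold"], ["up", "down"], epsilon=0.2)
    strategy, action = agent.act("s0")
    agent.learn("s0", strategy, action, team_reward=1.0, own_reward=0.5,
                next_state="s1")
```

Conditioning the second layer on the first layer's choice is one plausible way to couple the two units; the paper may couple them differently.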