
Multi-agent reinforcement learning based on policies of global objective

Abstract: For general-sum games, we take the collective rationality of all agents into account, define the agents' global objective, and propose a novel multi-agent reinforcement learning (RL) algorithm based on a global policy. In each learning step, all agents commit to selecting the global policy in order to achieve the global goal. We prove that the algorithm converges under certain restrictions on the stage games defined by the learned Q values, and show that its computational time complexity is considerably lower than that of previously developed multi-agent learning algorithms for general-sum games. An example is analyzed to illustrate the algorithm's merits.
Source: Journal of Systems Engineering and Electronics (系统工程与电子技术(英文版)), indexed in SCIE, EI, CSCD; 2005, No. 3, pp. 676-681 (6 pages)
Keywords: Markov games, reinforcement learning, collective rationality, policy.
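
The learning step described in the abstract (all agents committing to one global policy, then updating their Q values) can be illustrated with a minimal sketch. The abstract does not define the global objective, so the code below assumes, purely for illustration, that it is the sum of all agents' Q values over joint actions. The payoff table, the single-state setting, and the names `REWARDS`, `global_joint_action`, and `train` are hypothetical and are not taken from the paper.

```python
import itertools
import random

# Hypothetical 2-agent, single-state general-sum game (not from the paper).
# REWARDS[joint_action] = (reward to agent 0, reward to agent 1)
REWARDS = {
    (0, 0): (3.0, 3.0),
    (0, 1): (0.0, 5.0),
    (1, 0): (5.0, 0.0),
    (1, 1): (1.0, 1.0),
}
JOINT_ACTIONS = list(itertools.product([0, 1], repeat=2))


def global_joint_action(q_tables):
    """Assumed global policy: the joint action that maximizes
    the sum of all agents' Q values (an illustrative choice)."""
    return max(JOINT_ACTIONS, key=lambda a: sum(q[a] for q in q_tables))


def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    # One Q table per agent, indexed by joint action.
    q_tables = [{a: 0.0 for a in JOINT_ACTIONS} for _ in range(2)]
    for _ in range(steps):
        # All agents commit to the same joint action; explore occasionally.
        if rng.random() < epsilon:
            action = rng.choice(JOINT_ACTIONS)
        else:
            action = global_joint_action(q_tables)
        rewards = REWARDS[action]
        # Bootstrap from the value of the global joint action
        # in the (single) next state.
        next_action = global_joint_action(q_tables)
        for i, q in enumerate(q_tables):
            target = rewards[i] + gamma * q[next_action]
            q[action] += alpha * (target - q[action])
    return q_tables


q_tables = train()
best = global_joint_action(q_tables)
print(best)
```

In this hypothetical payoff table, joint action (0, 0) has the largest summed payoff, so under the assumed objective the greedy choice is expected to settle there. Selecting the joint action needs only one maximization over the summed table, which is one plausible reading of why such a scheme could be cheaper than computing an equilibrium of every stage game; the paper's actual complexity argument is not reproduced here.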