
A Method for Finding Pareto-Efficient Solutions in Multi-Agent Systems

Method to seek Pareto improvement in multi-Agent system
Abstract: Robert Axelrod's well-known experiments showed that algorithms that are well-intentioned, forgiving, firm, and simple consistently come out as winners. The PESCO algorithm is designed on this idea. In a cooperative game against a cooperative opponent, it seeks a Pareto-efficient solution and tries to reach a win-win outcome wherever possible. In a non-cooperative game, or when the opponent does not cooperate, it guarantees a safe payoff. PESCO was played against several other algorithms in the supplier-retailer game and the Stackelberg game, both of which admit cooperation, and in the non-cooperative coin-guessing game, and it achieved good results.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2010, Issue 22, pp. 229-232 (4 pages)
Keywords: Tit-for-Tat algorithm; Q-learning; Pareto-efficient solution
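
The abstract relies on three ideas without defining them: a Pareto-efficient outcome, a safe (worst-case) payoff, and the Tit-for-Tat rule named in the keywords. The sketch below illustrates those three concepts only; it is not the PESCO algorithm, whose details the abstract does not give. The payoff table is a hypothetical Prisoner's Dilemma chosen for illustration, the function names are invented here, and the safe payoff is computed over pure strategies only, which is a simplification.

```python
# Hypothetical payoffs (a Prisoner's Dilemma), NOT taken from the paper.
# PAYOFFS[(row_action, col_action)] = (row_payoff, col_payoff)
# Action 0 = cooperate, action 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def dominates(p, q):
    """True if payoff pair p is at least as good as q for both players
    and strictly better for at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_efficient_outcomes(payoffs):
    """Joint actions whose payoff pair is not dominated by any other outcome."""
    return [
        joint
        for joint, pay in payoffs.items()
        if not any(dominates(other, pay) for other in payoffs.values())
    ]


def pure_security_level(payoffs, actions):
    """Row player's pure-strategy maximin: the action with the best
    worst-case payoff, and that payoff."""
    worst = {a: min(payoffs[(a, b)][0] for b in actions) for a in actions}
    best = max(worst, key=worst.get)
    return best, worst[best]


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous action."""
    return 0 if not opponent_history else opponent_history[-1]


if __name__ == "__main__":
    print(pareto_efficient_outcomes(PAYOFFS))
    print(pure_security_level(PAYOFFS, ACTIONS))
    print(tit_for_tat([0, 1]))
```

In this hypothetical table, mutual defection is the only outcome that is not Pareto-efficient, yet defecting is the action that secures the best worst-case payoff. That gap between the cooperative outcome and the guaranteed one is the tension the abstract describes.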

