
A Reinforcement Learning Method for RoboCup Soccer Half-Field Defense
Abstract: RoboCup soccer simulation is an excellent platform for studying cooperation and competition among multiple agents, and improving an agent's defensive ability is a challenging problem. To design a reasonable defensive strategy, the half-field defense subtask of a RoboCup match is decomposed into several one-on-one defense tasks. Each one-on-one task is posed as a zero-sum Markov game and solved with a reinforcement learning method based on Markov games, for which a concrete learning algorithm is given. The algorithm was applied to the 3D simulation team DUT Fantasia (Dalian University of Technology) and performed well in actual matches. The results support the conclusion that reinforcement learning based on zero-sum Markov games outperforms hand-coded behavior in one-on-one defense.
Affiliation: Dalian University of Technology
Source: Computer Technology and Development, 2008, No. 1, pp. 59-62 (4 pages)
Funding: National Natural Science Foundation of China (50575031)
Keywords: RoboCup; reinforcement learning; Markov game; zero-sum game
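
The abstract does not reproduce the learning algorithm itself, so the following is only a minimal sketch of a reinforcement learning update for a zero-sum Markov game, in the style of minimax-Q (Littman, 1994). It is not the paper's implementation. The state names, the reward, the learning rate `alpha`, the discount `gamma`, and the restriction to two defender actions are all assumptions made for illustration. The two-action restriction lets the maximin mixed strategy be found exactly without a linear-programming solver.

```python
from itertools import combinations


def solve_two_action_game(payoff):
    """Solve a zero-sum matrix game in which the maximizer has two actions.

    payoff[a][o] is the maximizer's payoff for its action a (0 or 1) against
    opponent action o. Returns (value, p), where p is the probability of
    playing action 0 in the maximin mixed strategy.
    """
    top, bottom = payoff

    def worst_case(p):
        # Expected payoff against the opponent's best reply to the mix p.
        return min(p * t + (1 - p) * b for t, b in zip(top, bottom))

    # Against each opponent action the expected payoff is linear in p, so the
    # worst case is concave and piecewise linear. Its maximum lies at an
    # endpoint or at a crossing point of two of those lines.
    candidates = [0.0, 1.0]
    for i, j in combinations(range(len(top)), 2):
        denom = (top[i] - bottom[i]) - (top[j] - bottom[j])
        if denom != 0:
            p = (bottom[j] - bottom[i]) / denom
            if 0.0 <= p <= 1.0:
                candidates.append(p)
    best_p = max(candidates, key=worst_case)
    return worst_case(best_p), best_p


def minimax_q_update(Q, s, a, o, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one minimax-Q style update to Q[s][a][o] and return the new value.

    Q maps each (hypothetical) state to a 2 x n payoff table indexed by
    defender action a and attacker action o.
    """
    value_next, _ = solve_two_action_game(Q[s_next])
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * (reward + gamma * value_next)
    return Q[s][a][o]


# Hypothetical two-state example: "s1" has a matching-pennies payoff table.
Q = {"s0": [[0.0, 0.0], [0.0, 0.0]], "s1": [[1.0, -1.0], [-1.0, 1.0]]}
value, p = solve_two_action_game(Q["s1"])
updated = minimax_q_update(Q, "s0", a=0, o=1, reward=1.0, s_next="s1")
```

With more than two defender actions, the maximin step would normally be solved as a small linear program instead of by enumerating crossing points.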

