
Application of the Q-Learning Algorithm in RoboCup Dribbling (cited 3 times)

Application of Q-Learning Algorithm in Dribbling Ball Training of RoboCup
Abstract: The Robot World Cup soccer championship (RoboCup) is one of the most influential robot soccer competitions in the world, and the simulation league is an important part of it. Given the importance of dribbling skill in simulation-league matches, we apply the Q-learning algorithm to dribbling training, so that the agent itself has the ability to learn and adapt and can acquire knowledge from its environment on its own. This paper describes the method and experimental procedure for training the 1 vs. 1 dribbling skill with Q-learning in a specific scenario, and validates the training method by applying it to the training of an actual team.
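The abstract does not give the paper's state/action design or reward function, so the sketch below only illustrates the tabular Q-learning update the training builds on: Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)]. The toy chain environment, the action names, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict

def q_learning(env_step, actions, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy.

    Core update: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
    """
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q-table keyed by (state, action), defaults to 0
    for _ in range(episodes):
        s, done = 0, False  # every episode starts from state 0
        while not done:
            # epsilon-greedy: explore with probability epsilon, else act greedily
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda act: Q[(s, act)])
            s_next, r, done = env_step(s, a)
            # bootstrap from the best next action unless the episode ended
            target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s_next
    return Q

def chain_step(s, a):
    """Toy stand-in for a dribbling task: advance along states 0..4.

    'fwd' moves one state toward the goal (state 4, reward 1);
    'back' retreats one state (floored at 0) with no reward.
    """
    if a == "fwd":
        s_next = s + 1
        return s_next, (1.0 if s_next == 4 else 0.0), s_next == 4
    return max(s - 1, 0), 0.0, False

Q = q_learning(chain_step, ["fwd", "back"])
# After training, the greedy action in every non-terminal state is 'fwd'
print([max(["fwd", "back"], key=lambda a: Q[(s, a)]) for s in range(4)])
```

The same update applies unchanged to the paper's 1 vs. 1 setting; only the state encoding (ball and opponent positions), the action set (dribble directions), and the reward shaping differ.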
Source: System Simulation Technology, 2005, No. 2, pp. 84-87 (4 pages)
Keywords: reinforcement learning; RoboCup; dribbling

References (2)

  • 1 Kostiadis K, Hu H. Reinforcement learning and co-operation in a simulated multi-agent system[C]//Proceedings of the 1999 IEEE/RSJ International Conference on Intelligent Robots and Systems, 17-21 Oct. 1999, 2: 990-995.
  • 2 Piao S, Hong B. Fast reinforcement learning approach to cooperative behavior acquisition in multi-agent system[C]//Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems, 30 Sept.-5 Oct. 2002, 1: 871-875.

