Design and Implementation of the NDSocTeam Simulated Robot Soccer Team (cited by: 1)

NDSocTeam: Design and Implementation


Abstract: Robot soccer (RoboCup) is an ideal test platform for studying multi-agent system architectures, multi-agent teamwork theory, and machine learning methods, and it has great appeal to researchers in artificial intelligence. This paper presents the design principles and implementation of the simulation team NDSocTeam, focusing on the overall structure of the system, the architecture of the player agent, and the realization of its machine learning methods. Because the learning capability of the agents is critical to a simulated robot soccer team, the agent architecture is designed around machine learning. First, we introduce an agent architecture that allows agents to decompose the task space. Since learning a mapping directly from an agent's sensors to its actuators is intractable, the learning tasks are divided hierarchically into four layers, from the basic skill layer up to the strategy layer, and different machine learning methods, such as neural networks, reinforcement learning, and C4.5, are applied to different layers. Second, we introduce the machine learning system of NDSocTeam, which combines layered learning with a variety of learning methods. Given a hierarchical task decomposition, layered learning allows learning at each level of the hierarchy. Third, we discuss a new reinforcement learning algorithm used in NDSocTeam, the reinforcement backward propagation algorithm (RBPA). RBPA represents the value function with a feed-backward neural network and uses it to search for the optimal policy; an approximator is needed because the state space is continuous and therefore contains far too many state-action pairs to enumerate.
Finally, built on this agent architecture and layered machine learning system, NDSocTeam is reported to perform well in matches against the former world champion, ATTCMUnited 2000.
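
The abstract does not give the details of RBPA, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea the abstract describes: representing the value function over a continuous state space with a small neural network and adjusting it by propagating a reinforcement (temporal-difference) error backward through the network. The class name TinyQNetwork, the layer sizes, the learning rate, the discount factor, and the two-dimensional example state are all assumptions made for illustration, and the update shown is ordinary semi-gradient Q-learning.

```python
import math
import random


class TinyQNetwork:
    """One-hidden-layer network mapping a continuous state to one Q-value per action.

    Hypothetical illustration only; not the RBPA described in the paper.
    """

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.05, gamma=0.9, seed=0):
        rng = random.Random(seed)
        self.lr, self.gamma = lr, gamma
        # Each hidden row holds n_inputs weights plus a trailing bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        # Each output row holds n_hidden weights plus a trailing bias weight.
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_actions)]

    def _hidden(self, state):
        x = list(state) + [1.0]
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.w1]

    def q_values(self, state):
        h = self._hidden(state) + [1.0]
        return [sum(w * v for w, v in zip(row, h)) for row in self.w2]

    def update(self, state, action, reward, next_state, done):
        """Apply one semi-gradient Q-learning step; return the TD error before it."""
        target = reward
        if not done:
            target += self.gamma * max(self.q_values(next_state))
        x = list(state) + [1.0]
        h = self._hidden(state)
        hb = h + [1.0]
        q = sum(w * v for w, v in zip(self.w2[action], hb))
        td_error = target - q
        out_row = self.w2[action][:]  # pre-update copy used for the hidden-layer gradient
        # Output layer: dQ/dw2[action][j] = hb[j].
        for j in range(len(hb)):
            self.w2[action][j] += self.lr * td_error * hb[j]
        # Hidden layer: dQ/dw1[j][i] = w2[action][j] * (1 - h_j^2) * x[i].
        for j, hj in enumerate(h):
            grad_pre = out_row[j] * (1.0 - hj * hj)
            for i in range(len(x)):
                self.w1[j][i] += self.lr * td_error * grad_pre * x[i]
        return td_error


net = TinyQNetwork(n_inputs=2, n_hidden=8, n_actions=3)
state = (0.3, 0.7)  # hypothetical normalized features, e.g. ball distance and angle
for _ in range(200):
    net.update(state, action=1, reward=1.0, next_state=state, done=True)
print(net.q_values(state))
```

Because the network generalizes across nearby states, no table of state-action pairs is needed, which is the motivation the abstract gives for using a network-based value function.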

Authors: Yang Pei (杨佩), Zhao Zhihong (赵志宏), Chen Zhaoqian (陈兆乾)

Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University; Software Institute, Nanjing University

Source: Journal of Nanjing University (Natural Science) (《南京大学学报（自然科学版）》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2003, No. 5, pp. 451-458 (8 pages)

Funding: National Natural Science Foundation of China (699051001, 60003010)

Keywords: simulated robot soccer team; NDSocTeam system; agent system; machine learning; system design; architecture; RoboCup; agent architecture

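Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Equipment]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]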


