Abstract
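Robot soccer (RoboCup) is an ideal testbed for studying multi-agent system architectures, theories of multi-agent teamwork, and machine learning methods. This paper presents the design principles and implementation techniques of NDSocTeam, a simulation team developed by the authors. The system uses a player-agent architecture centered on machine learning, and it builds a machine learning system that combines layered learning with several learning techniques. The paper focuses on the overall structure of NDSocTeam, the structure of the player agent, and the implementation of the machine learning methods.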
RoboCup is an ideal platform for studying multi-agent system architecture, multi-agent teamwork, and machine learning methods, and it has great appeal to researchers in artificial intelligence. This paper describes the infrastructure of NDSocTeam, the architecture of its agents, and the realization of its machine learning methods. Because the learning capability of the agent is critical to a robotic soccer simulation team, the agent architecture is designed around machine learning. First, we introduce the NDSocTeam agent architecture, which allows agents to decompose the task space. Since learning a mapping directly from an agent's sensors to its actuators is intractable, the learning tasks are divided hierarchically into four layers, from the basic skill layer to the strategy layer. Different machine learning methods are applied to different layers, such as neural networks, reinforcement learning, C4.5, and so on. Second, we introduce the machine learning system in NDSocTeam, which features layered learning and a combination of various learning methods. Given a hierarchical task decomposition, layered learning allows learning at each level of the hierarchy. Third, we discuss a new reinforcement learning algorithm used in NDSocTeam, the reinforcement backward propagation algorithm (RBPA). RBPA represents the value function with a back-propagation neural network and uses it to find the optimal policy; a network is used because the state space is continuous and therefore has too many state-action pairs to enumerate. Finally, with this agent architecture and layered machine learning system, NDSocTeam is shown to perform well when competing against the former world champion, ATTCMUnited 2000.
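The abstract does not give the details of RBPA, so the following is not that algorithm. It is only a minimal sketch of the general idea the abstract names: when the state is continuous, a small neural network can stand in for a table of state-action values and be trained by backpropagating a temporal-difference error. Everything here is an assumption made for illustration, including the one-dimensional toy task, the names `QNetwork`, `td_target`, `greedy_action`, and `train`, the Q-learning-style target, and all hyperparameters.

```python
import math
import random

ACTIONS = (-1.0, 1.0)  # hypothetical discrete actions: step left / step right


class QNetwork:
    """One-hidden-layer network approximating Q(s, a) for a scalar state.

    Illustrative only: this is NOT the RBPA network from the paper.
    """

    def __init__(self, hidden=8, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # Each hidden unit holds weights for (state, action, bias).
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(3)]
                   for _ in range(hidden)]
        # Output weights: one per hidden unit, plus a trailing bias term.
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden + 1)]

    def _hidden(self, s, a):
        return [math.tanh(w[0] * s + w[1] * a + w[2]) for w in self.w1]

    def q_value(self, s, a):
        h = self._hidden(s, a)
        # zip stops after the hidden units; the bias is added separately.
        return sum(w * x for w, x in zip(self.w2, h)) + self.w2[-1]

    def update(self, s, a, target):
        """One gradient step moving Q(s, a) toward a fixed target."""
        h = self._hidden(s, a)
        q = sum(w * x for w, x in zip(self.w2, h)) + self.w2[-1]
        err = target - q
        for j, hj in enumerate(h):
            # Gradient through tanh, computed before w2[j] is changed.
            grad_pre = self.w2[j] * (1.0 - hj * hj)
            self.w2[j] += self.lr * err * hj
            for k, x in enumerate((s, a, 1.0)):
                self.w1[j][k] += self.lr * err * grad_pre * x
        self.w2[-1] += self.lr * err
        return err


def td_target(net, reward, s_next, done, gamma=0.9):
    """Q-learning target: r for terminal steps, else r + gamma * max_a Q(s', a)."""
    if done:
        return reward
    return reward + gamma * max(net.q_value(s_next, a) for a in ACTIONS)


def greedy_action(net, s):
    return max(ACTIONS, key=lambda a: net.q_value(s, a))


def train(net, episodes=200, epsilon=0.2, seed=1):
    """Epsilon-greedy training on a toy task: reach s >= 1.0 on the line [0, 1]."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = rng.random()
        for _ in range(50):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = greedy_action(net, s)
            s_next = min(1.0, max(0.0, s + 0.1 * a))
            done = s_next >= 1.0
            reward = 1.0 if done else 0.0
            net.update(s, a, td_target(net, reward, s_next, done))
            if done:
                break
            s = s_next
    return net
```

Calling `train(QNetwork())` runs the toy loop, and `greedy_action(net, s)` then reads the learned preference at any continuous state `s`. The point of the sketch is that no table indexed by state is ever built, which is the motivation the abstract gives for a network-based value function.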
Source
Journal of Nanjing University (Natural Science), 2003, No. 5, pp. 451-458 (8 pages)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
Funding
National Natural Science Foundation of China (699051001, 60003010)