RoboCup is a well-suited platform for studying multi-agent system architecture, multi-agent teamwork, and machine learning methods, and it has attracted considerable interest from researchers in artificial intelligence. This paper describes the infrastructure of NDSocTeam, the architecture of its agents, and its implementation of machine learning methods. Because the learning capability of the agents is critical to a robotic simulation team, the agent architecture is designed around machine learning. First, we introduce the NDSocTeam agent architecture, which allows agents to decompose the task space. Since learning a mapping directly from the agents' sensors to their actuators is intractable, the learning tasks are divided hierarchically into four layers, from the basic skill layer up to the strategy layer. Different machine learning methods are applied to different layers, including neural networks, reinforcement learning, C4.5, and so on. Second, we introduce the NDSocTeam machine learning system, which is characterized by layered learning and by the combination of several learning methods. Given a hierarchical task decomposition, layered learning allows learning to take place at each level of the hierarchy. Third, we discuss a new reinforcement learning algorithm used in NDSocTeam, the reinforcement backward propagation algorithm (RBPA). RBPA represents the value function with a feed-backward neural network and uses it to search for the optimal policy. The network is used because the state space is continuous and therefore contains a very large number of state-action pairs. Finally, built on this agent architecture and layered machine learning system, NDSocTeam is shown to perform well in matches against the former world champion, ATTCMUnited 2000.
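The idea of representing a value function with a neural network over a continuous state space can be illustrated with a minimal sketch. The abstract does not give the details of RBPA, so the code below is not RBPA itself: it assumes a generic Q-learning-style temporal-difference update applied to a small one-hidden-layer network trained by backpropagation. All names (`ValueNet`, `td_target`, `q_learning_step`, `greedy_action`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


class ValueNet:
    """Tiny one-hidden-layer network approximating Q(state, action).

    Hypothetical illustration only; not the network used in NDSocTeam.
    """

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # tanh hidden layer followed by a linear output unit.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return out, hidden

    def predict(self, x):
        return self.forward(x)[0]

    def update(self, x, target, lr=0.05):
        """One gradient step on 0.5 * (target - Q(x))**2 via backpropagation."""
        out, hidden = self.forward(x)
        err = target - out
        for j, h in enumerate(hidden):
            # Error signal at hidden unit j, computed before w2[j] changes.
            grad_h = err * self.w2[j] * (1.0 - h * h)
            self.w2[j] += lr * err * h
            self.b1[j] += lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] += lr * grad_h * xi
        self.b2 += lr * err
        return err


def td_target(reward, next_values, gamma, done):
    """Q-learning target: r, or r + gamma * max_a' Q(s', a') if not terminal."""
    if done or not next_values:
        return reward
    return reward + gamma * max(next_values)


def q_learning_step(net, state, action, reward, next_state, actions,
                    gamma=0.9, done=False):
    """Move Q(state, action) toward the TD target; returns the TD error."""
    next_values = [net.predict(next_state + [a]) for a in actions]
    target = td_target(reward, next_values, gamma, done)
    return net.update(state + [action], target)


def greedy_action(net, state, actions):
    """Pick the action with the highest estimated value in this state."""
    return max(actions, key=lambda a: net.predict(state + [a]))
```

In this sketch a state is a list of floats and an action is a single float appended to it, so a call such as `q_learning_step(net, [0.9], 1.0, 1.0, [1.0], [-1.0, 1.0], done=True)` nudges the estimate for one continuous state-action pair without enumerating the state space. That is the reason for using a function approximator instead of a lookup table.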
Journal of Nanjing University (Natural Science)