
Research on Reinforcement Learning Algorithms for Dynamic Multi-Robot Formation (cited by: 15)

English title as published: Research on Dynamic Team Formation of Multi-Robots Reinforcement Learning
Abstract: In artificial intelligence, reinforcement learning has attracted wide attention because it is self-learning and adaptive. As multi-agent theory in distributed artificial intelligence has developed, distributed reinforcement learning algorithms have become a research focus. This paper first reviews the state of reinforcement learning research, then uses dynamic multi-robot formation as the study model to describe how distributed reinforcement learning can implement hierarchical behavior control for a multi-robot system. A SOM (self-organizing map) neural network partitions the state space automatically to speed up learning. A BP (backpropagation) neural network implements the reinforcement learning to improve the system's generalization ability. Internal and external reinforcement signals represent the interests of the individual robot and of the robot group, respectively. To make the control task explicit, the system uses layered control with blackboard communication. Finally, simulation results are presented to show the validity of the algorithm.
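The abstract gives no formulas, so the following is only a rough sketch of how the ingredients it names could fit together, not the authors' implementation. It assumes a one-dimensional SOM whose units act as discrete state labels, a weighted sum of the internal and external reinforcement signals (the weight `beta` is hypothetical), and a tabular Q-learning update standing in for the BP network that the paper uses as its learner. All function and parameter names are illustrative.

```python
import random


def nearest_unit(weights, x):
    """Return the index of the SOM unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(samples, n_units, epochs=20, lr=0.5, seed=0):
    """Train a 1-D SOM; each unit comes to represent one region of state space."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        # Shrink the step size and the neighbourhood radius as training proceeds.
        frac = 1.0 - epoch / epochs
        radius = max(1, int(n_units / 2 * frac))
        for x in samples:
            bmu = nearest_unit(weights, x)  # best-matching unit
            lo, hi = max(0, bmu - radius), min(n_units, bmu + radius + 1)
            for i in range(lo, hi):
                influence = 1.0 / (1 + abs(i - bmu))
                for d in range(dim):
                    weights[i][d] += lr * frac * influence * (x[d] - weights[i][d])
    return weights


def combined_reward(r_internal, r_external, beta=0.5):
    """Blend the individual (internal) and team (external) signals.

    The linear blend and the weight beta are assumptions for illustration.
    """
    return beta * r_internal + (1.0 - beta) * r_external


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step over SOM-discretised states (a stand-in
    for the BP-network learner mentioned in the abstract)."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
```

In use, a robot would map each raw sensor reading to a state label with `nearest_unit`, compute a reward with `combined_reward`, and call `q_update` once per step. Replacing the table with a function approximator is what would provide the generalization the abstract attributes to the BP network.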
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list, 2003, No. 10, pp. 1444-1450 (7 pages)
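Funding: Foundation of the Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences (RL200106); National Defense Basic Research Program Foundation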
Keywords: multi-robot; team formation; reinforcement learning; behavior control

References (15)

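  • 1. Cai Qingsheng, Zhang Bo. A reinforcement learning model based on agent teams and its application [J]. Journal of Computer Research and Development, 2000, 37(9): 1087-1093. Cited by: 31
  • 2. Zhang Rubo, Yang Guangming, Gu Guochang, Zhang Guoyin. Q-learning and its application to local path planning for intelligent robots [J]. Journal of Computer Research and Development, 1999, 36(12): 1430-1436. Cited by: 17
  • 3. Chen Weidong, Dong Shenglong, Xi Yugeng. A distributed autonomous robot system based on an open multi-agent architecture [J]. Robot, 2001, 23(1): 45-50. Cited by: 15
  • 4. Zhang Rubo, Gu Guochang, Liu Zhaode, Wang Xingce. Reinforcement learning: theory, algorithms and applications [J]. Control Theory & Applications, 2000, 17(5): 637-642. Cited by: 91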
  • 5. Leslie Pack Kaelbling, Michael L Littman. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 1996, 4(1): 237-285.
  • 6. Ming Tan. Multi-agent reinforcement learning: Independent vs cooperative agents. In: Proc of the 10th Int'l Conf on Machine Learning. Amherst: University of Massachusetts, 1993. 330-337.
  • 7. Junling Hu, Michael P Wellman. Multi-agent reinforcement learning: Theoretical framework and an algorithm. The 15th Int'l Conf on Machine Learning, Madison, Wisconsin, 1998.
  • 8. Michael L Littman. Markov games as a framework for multi-agent reinforcement learning. In: Proc of the 11th Int'l Conf on Machine Learning. San Francisco: Morgan Kaufmann, 1994. 157-163.
  • 9. Akihide Hiura. Cooperative behavior of various agents in dynamic environment. Journal of Computers and Industrial Engineering, 1997, 33(3-4): 601-61M.
  • 10. Tucker Balch, Ronald C Arkin. Behavior-based formation control for multirobot teams. IEEE Trans on Robotics and Automation, 1998, 14(6): 926-939.


