
Mobile Robot Navigation Control Based on Hierarchical Reinforcement Learning (cited by: 2)
Abstract To address mobile robot navigation in unknown environments, this paper proposes a hybrid control method based on hierarchical reinforcement learning (HRL). A combined grid-topological map is learned as the environment representation so that global and local navigation control can be coordinated and optimized. Reactive and deliberative navigation control of the mobile robot are then implemented by extending HRL to different control levels: (1) reactive control uses flat reinforcement learning; (2) global navigation control extends reinforcement learning to a qualitative state-action space based on topological analysis. Experimental and test results show that the method optimizes the navigation task globally, avoids becoming trapped in local minima, and adapts well to unknown dynamic environments.
Source Journal of Nanjing University of Aeronautics & Astronautics (《南京航空航天大学学报》; indexed in EI, CAS, CSCD, and the Peking University core journal list), 2006, No. 1, pp. 70-75 (6 pages)
Funding Supported by the National Natural Science Foundation of China (60575033)
Keywords hierarchical reinforcement learning; grid-topological map; mobile robot; navigation control
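The abstract names flat reinforcement learning as the basis of the reactive control layer. As a rough illustration only, the sketch below runs tabular Q-learning on a small occupancy grid. It is not the authors' implementation: the grid layout, reward values, hyperparameters, and the names `GRID`, `step`, `train`, and `greedy_path` are all assumptions made for this example, and the topological (deliberative) layer is not modeled.

```python
import random

# Hypothetical 4x4 occupancy grid (0 = free, 1 = obstacle); not taken from the paper.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move; hitting a wall or obstacle leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 4 and 0 <= c < 4) or GRID[r][c] == 1:
        return state, -1.0, False      # assumed collision penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True      # assumed goal reward
    return (r, c), -0.1, False         # assumed small cost per step


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(4) for c in range(4) for a in range(4)}
    for _ in range(episodes):
        state = START
        for _ in range(100):           # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: q[(state, i)])
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(4))
            # One-step Q-learning update toward the bootstrapped target.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned greedy policy from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(4), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In a hierarchical scheme of the kind the abstract describes, a learner like this would presumably handle only local, reactive decisions, while a higher level would choose among topological nodes; how the two levels are coupled is specified in the paper itself, not here.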













