Mobile Robot Navigation Control Based on Hierarchical Reinforcement Learning (cited by: 2)


Abstract: To address mobile robot navigation in unknown environments, this paper proposes a hybrid control method based on hierarchical reinforcement learning (HRL). A combined grid-topological map is learned to represent the environment, so that global and local navigation control can be coordinated and optimized. Reactive and deliberative navigation control of the mobile robot are then implemented by extending HRL at different control levels: (1) reactive control uses flat reinforcement learning; (2) global navigation control extends reinforcement learning to a qualitative state-action space based on topological analysis. Experimental and test results show that the method optimizes the navigation task globally, avoids getting trapped in local minima, and adapts well to unknown dynamic environments.
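As an illustration of the global layer described in the abstract, the following is a minimal sketch of tabular Q-learning over a topological map, where each state is a map node and each action is a move to a neighbouring node. It is not the authors' algorithm: the map `TOPO_MAP`, the goal node, the reward values, and the function names `train_q` and `greedy_path` are all assumptions made for this example.

```python
import random

# Hypothetical topological map: nodes are regions, edges are traversable links.
TOPO_MAP = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
GOAL = "E"


def train_q(topo_map, goal, episodes=500, alpha=0.5, gamma=0.9,
            epsilon=0.2, seed=0):
    """Tabular Q-learning where an action means 'move to a neighbouring node'."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in topo_map.items()}
    starts = [s for s in topo_map if s != goal]
    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(50):  # cap on steps per episode
            # Epsilon-greedy choice among the neighbours of the current node.
            if rng.random() < epsilon:
                action = rng.choice(topo_map[state])
            else:
                action = max(q[state], key=q[state].get)
            # Assumed reward scheme: +10 on reaching the goal, -1 per move.
            reward = 10.0 if action == goal else -1.0
            future = 0.0 if action == goal else max(q[action].values())
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action  # the chosen neighbour becomes the next state
            if state == goal:
                break
    return q


def greedy_path(q, start, goal, max_steps=20):
    """Follow the highest-valued action from each node until the goal."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        state = path[-1]
        path.append(max(q[state], key=q[state].get))
    return path


q_table = train_q(TOPO_MAP, GOAL)
route = greedy_path(q_table, "A", GOAL)
print(route)
```

In a hybrid scheme of the kind the abstract outlines, a route like this would supply subgoals to a separate reactive controller operating on the grid map. That lower layer is omitted here.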

Authors: Chen Chunlin (陈春林), Chen Zonghai (陈宗海), Zhuo Rui (卓睿), Zhou Guangming (周光明)

Affiliation: Department of Automation, University of Science and Technology of China

Source: Journal of Nanjing University of Aeronautics & Astronautics (《南京航空航天大学学报》), 2006, No. 1, pp. 70-75 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.

Funding: Supported by the National Natural Science Foundation of China (60575033)

Keywords: hierarchical reinforcement learning; grid-topological map; mobile robot; navigation control

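Classification: TP24 [Automation and Computer Technology: Detection Technology and Automatic Devices]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]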


