
A Dyna_Q-learning Algorithm Used in Underground Path Planning (cited by: 2)
Abstract: In goal-based reinforcement learning tasks, Euclidean distance is commonly used as the heuristic in the planning step of Dyna_Q-learning. It performs poorly, however, on tasks whose state space is not continuous in Euclidean space, such as path planning for a rescue robot in an underground coal mine. To address this problem, the paper introduces the Laplacian Eigenmap, a manifold learning method with comparatively low computational complexity, and proposes an improved Dyna_Q-learning algorithm based on a manifold distance metric. The algorithm is simulated in a grid world that resembles an underground environment, and the simulation results verify its effectiveness.
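The abstract does not give the algorithm's details, so the following is only a minimal sketch of the underlying idea, not the authors' implementation. It assumes a hypothetical 3x5 grid world shaped like a U-shaped tunnel, builds a 4-neighbour graph over the free cells, and embeds them with the eigenvectors of the normalized graph Laplacian (a Laplacian Eigenmap). The names `GRID`, `laplacian_eigenmap`, `euclidean`, and `manifold` are illustrative assumptions. The point is that two cells separated by a wall are close in Euclidean terms but far apart in the embedded space, which is the property a heuristic planner would want.

```python
import numpy as np

# Hypothetical grid world: 1 = wall, 0 = free cell (a U-shaped tunnel).
GRID = np.array([
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
])


def laplacian_eigenmap(grid, dim=2):
    """Embed free cells with the low-order eigenvectors of the graph Laplacian.

    Assumes the free cells form one connected region with no isolated cell.
    """
    cells = [(int(r), int(c)) for r, c in zip(*np.where(grid == 0))]
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)

    # Adjacency matrix of the 4-neighbour graph over free cells.
    W = np.zeros((n, n))
    for (r, c), i in index.items():
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in index:
                W[i, index[nb]] = W[index[nb], i] = 1.0

    # Solve L y = lambda D y through the symmetric normalized Laplacian.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

    # Skip the trivial constant solution (eigenvalue 0), keep the next `dim`.
    Y = d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]
    return {cell: Y[i] for cell, i in index.items()}


def euclidean(a, b):
    """Straight-line distance between two grid cells."""
    return float(np.hypot(a[0] - b[0], a[1] - b[1]))


def manifold(embedding, a, b):
    """Distance between two cells in the embedded (manifold) coordinates."""
    return float(np.linalg.norm(embedding[a] - embedding[b]))


embedding = laplacian_eigenmap(GRID)
goal, across_wall, same_tunnel = (2, 0), (0, 0), (2, 3)
print("Euclidean:", euclidean(across_wall, goal), euclidean(same_tunnel, goal))
print("Manifold: ", manifold(embedding, across_wall, goal),
      manifold(embedding, same_tunnel, goal))
```

In this layout the cell at `(0, 0)` is only two rows from the goal but ten steps away along the tunnel, while `(2, 3)` is three steps away. A Euclidean heuristic ranks the first cell as closer; the embedded distance ranks it as farther, which matches the walkable distance.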
Source: Journal of Mine Automation (《工矿自动化》), Peking University Core Journal, 2012, No. 12, pp. 71-76 (6 pages)
Funding: National Natural Science Foundation of China (61273143); China University of Mining and Technology Youth Science and Technology Fund (OC080252)
Keywords: Dyna_Q-learning; Euclidean distance; heuristic planning; path planning; Laplacian Eigenmap; manifold distance
