
Research on Path Planning Algorithm of Mobile Robot Based on Heuristic Reinforcement Learning (Cited by: 2)
Abstract: Aiming at the problems of low learning efficiency and slow convergence in path planning for mobile robots using reinforcement learning, an improved Q-learning algorithm is proposed. First, a dynamic action set strategy is introduced, which selects the robot's action set according to the positions of its current point and the end point. Then, a heuristic reward and punishment function is added to the algorithm, so that the robot receives different rewards for taking different actions. These improvements raise the algorithm's learning efficiency and speed up its convergence. Finally, simulation experiments are carried out in a grid environment; the results show that the improved algorithm converges considerably faster than the traditional Q-learning algorithm.
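The abstract describes standard tabular Q-learning augmented with a heuristic reward and punishment function on a grid map. Since the paper's exact reward function and dynamic action set rule are not given in the abstract, the sketch below only illustrates the general idea: a hypothetical Manhattan-distance heuristic rewards moves that approach the goal and penalizes the rest. All names and parameter values are illustrative assumptions, and the dynamic action set strategy is omitted.

```python
import random

GRID = 5                       # 5x5 grid world (illustrative)
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1              # learning rate, discount, exploration

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def heuristic_reward(state, nxt):
    """Hypothetical heuristic: reward moves that reduce the distance
    to the goal, penalize moves that do not (the paper's exact
    function is not specified in the abstract)."""
    if nxt == GOAL:
        return 100.0
    return 1.0 if manhattan(nxt, GOAL) < manhattan(state, GOAL) else -1.0

def step(state, action):
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < GRID and 0 <= nxt[1] < GRID):
        nxt = state                            # hit a wall: stay in place
    return nxt, heuristic_reward(state, nxt)

# Q-table: one value per (state, action-index) pair
Q = {((r, c), a): 0.0
     for r in range(GRID) for c in range(GRID) for a in range(4)}

random.seed(0)
for episode in range(500):
    s = START
    for _ in range(200):
        # epsilon-greedy action selection
        a = random.randrange(4) if random.random() < EPS else \
            max(range(4), key=lambda i: Q[(s, i)])
        nxt, r = step(s, ACTIONS[a])
        # standard Q-learning update with the shaped reward
        best_next = max(Q[(nxt, i)] for i in range(4))
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = nxt
        if s == GOAL:
            break
```

After training, following the greedy policy from the start cell walks to the goal; the distance-based shaping term is what accelerates convergence relative to a sparse goal-only reward, which is the effect the paper's experiments measure.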
Authors: Pan Guoqian; Zhou Xinzhi (College of Electronics and Information Engineering, Sichuan University, Chengdu 610065)
Source: Modern Computer (《现代计算机》), 2022, No. 10, pp. 57-61.
Funding: Civil Aviation Administration of China Joint Research Fund (U1933123).
Keywords: mobile robot; path planning; Q-learning algorithm

