Journal Article

Solution to reinforcement learning problems with artificial potential field (Cited by: 3)

Abstract: A novel method was designed to solve reinforcement learning problems with an artificial potential field (APF). First, a reinforcement learning problem was transformed into a path planning problem using APF, which is a natural way to model this class of problem. Second, a new APF algorithm was proposed that uses a virtual water-flow concept to overcome the local-minimum problem of potential field methods. The performance of the new method was tested on a gridworld problem known as the key-and-door maze. The experimental results show that good, deterministic policies are found in almost all simulations within 45 trials. In comparison with WIERING's HQ-learning system, which needs 20 000 trials to reach a stable solution, the proposed method obtains an optimal and stable policy far more quickly. The new method is therefore a simple and effective way to find an optimal solution to a reinforcement learning problem.
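The abstract names two technical ingredients: a potential field over the gridworld that turns the learning task into path planning, and a virtual water-flow rule for escaping local minima. As a rough illustration only, the sketch below uses the common quadratic attractive and inverse repulsive potential forms and substitutes a plain random-kick escape for the paper's unspecified water-flow heuristic; all names and gains (K_ATT, K_REP, RHO0, descend) are invented for this example, not taken from the paper.

```python
import numpy as np

# Minimal gridworld APF sketch -- an illustration of the general idea,
# not the paper's algorithm. Quadratic attractive potential toward the
# goal, inverse repulsive potential near obstacles; a random kick stands
# in for the paper's (unspecified) virtual water-flow escape.

K_ATT, K_REP, RHO0 = 1.0, 5.0, 2.0   # assumed gains and repulsion radius

def potential(cell, goal, obstacles):
    """Total potential U(q) = U_att(q) + U_rep(q) at a grid cell."""
    q, g = np.asarray(cell, float), np.asarray(goal, float)
    u = 0.5 * K_ATT * np.sum((q - g) ** 2)           # attractive term
    for obs in obstacles:
        rho = np.linalg.norm(q - np.asarray(obs, float))
        if rho == 0:
            return np.inf                            # obstacle cell: impassable
        if rho <= RHO0:                              # repulsion acts only nearby
            u += 0.5 * K_REP * (1.0 / rho - 1.0 / RHO0) ** 2
    return u

def descend(start, goal, obstacles, shape, max_steps=500, rng=None):
    """Greedy descent on the potential; random kick at a local minimum."""
    rng = rng or np.random.default_rng(0)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    path, cell = [start], start
    for _ in range(max_steps):
        if cell == goal:
            break
        nbrs = [(cell[0] + dr, cell[1] + dc) for dr, dc in moves
                if 0 <= cell[0] + dr < shape[0] and 0 <= cell[1] + dc < shape[1]]
        best = min(nbrs, key=lambda c: potential(c, goal, obstacles))
        if potential(best, goal, obstacles) >= potential(cell, goal, obstacles):
            best = nbrs[rng.integers(len(nbrs))]     # escape a local minimum
        cell = best
        path.append(cell)
    return path

# Toy run: 5x5 grid, one obstacle between start and goal.
print(descend((0, 0), (4, 4), obstacles=[(2, 2)], shape=(5, 5)))
```

A genuine water-flow escape would presumably bias the agent along a connected low-potential channel rather than kicking randomly; the random kick is only the simplest stand-in that keeps the sketch runnable.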
Source: Journal of Central South University of Technology (EI indexed), 2008, No. 4: 552-557 (6 pages)
Funding: Projects (30270496, 60075019, 60575012) supported by the National Natural Science Foundation of China
Keywords: reinforcement learning; path planning; mobile robot navigation; artificial potential field; virtual water-flow

References (14)

  • 1 ZOU Xiao-bing, CAI Zi-xing, SUN Guo-rong. Non-smooth environment modeling and global path planning for mobile robots[J]. Journal of Central South University of Technology, 2003, 10(3): 248-254. (Cited by 6)
  • 2 ZHU Xiao-cai, DONG Guo-hua, CAI Zi-xing, HU De-wen. Robust simultaneous tracking and stabilization of wheeled mobile robots not satisfying nonholonomic constraint[J]. Journal of Central South University of Technology, 2007, 14(4): 537-545. (Cited by 5)
  • 3 WEN Zhi-qiang, CAI Zi-xing. Global path planning approach based on ant colony optimization algorithm[J]. Journal of Central South University of Technology, 2006, 13(6): 707-712. (Cited by 5)
  • 4 BARTO A G, MAHADEVAN S. Recent advances in hierarchical reinforcement learning[J]. Discrete Event Dynamic Systems, 2003, 13(1/2): 41-77.
  • 5 KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: A survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
  • 6 SUTTON R S, BARTO A G. Reinforcement learning: An introduction[M]. Cambridge: MIT Press, 1998.
  • 7 BANERJEE B, STONE P. General game learning using knowledge transfer[C]. Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007.
  • 8 ASADI M, HUBER M. Effective control knowledge transfer through learning skill and representation hierarchies[C]. Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007.
  • 9 KONIDARIS G, BARTO A. Autonomous shaping: Knowledge transfer in reinforcement learning[C]. Proceedings of the 23rd International Conference on Machine Learning, 2006.
  • 10 MEHTA N, NATARAJAN S, TADEPALLI P, FERN A. Transfer in variable-reward hierarchical reinforcement learning[C]. Workshop on Transfer Learning, Neural Information Processing Systems, 2005.

