
Research on Reinforcement Learning of the Robot Based on SOM Neural Network Quantization (自组织映射神经网络量化机器人强化学习方法研究)

Cited by: 2
Abstract: The term "reinforcement learning" comes from behavioral psychology, which treats behavior learning as a trial-and-error process by which environment states are mapped into corresponding actions. In designing an intelligent robot, the question is how to realize this behaviorist idea and learn actions through interaction with the environment. In this paper, the actions a robot takes to avoid obstacles in an unknown environment are treated as a class of behaviors, and reinforcement learning is used to realize the robot's learning of collision-avoidance behavior. Because quantization of the state space in the robot's local path planning is critical to improving its learning speed, a self-organizing map (SOM) neural network is adopted to quantize the state space; the self-organizing property of the SOM lets the quantization adapt flexibly to the environment. Reinforcement learning applied on top of this self-organized quantization solves the robot's collision-avoidance learning problem, and satisfactory results are obtained.
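The paper itself includes no code; the following is a minimal Python sketch of the scheme the abstract describes, in which a SOM quantizes the robot's continuous sensor readings into discrete state indices and tabular Q-learning runs over those indices. The environment interface (`reset`/`step`), the 1-D map topology, and all sizes and learning parameters here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class SOM:
    """Minimal 1-D self-organizing map used as a state-space quantizer:
    each continuous sensor vector is mapped to the index of its
    best-matching codebook vector."""
    def __init__(self, n_nodes, dim, lr=0.5, sigma=2.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(0.0, 1.0, (n_nodes, dim))  # codebook vectors
        self.lr, self.sigma = lr, sigma                 # assumed parameters

    def winner(self, x):
        # best-matching unit = nearest codebook vector
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train_step(self, x):
        b = self.winner(x)
        # Gaussian neighborhood over node indices (1-D topology)
        d = np.abs(np.arange(len(self.w)) - b)
        h = np.exp(-(d ** 2) / (2 * self.sigma ** 2))
        self.w += self.lr * h[:, None] * (x - self.w)   # pull nodes toward x
        return b

def run_episode(env, som, Q, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
    """One episode of tabular Q-learning over SOM-quantized states.
    `env` is a hypothetical interface: reset() -> sensor vector,
    step(action) -> (sensor vector, reward, done)."""
    s = som.train_step(env.reset())        # quantize initial reading
    done = False
    while not done:
        if np.random.rand() < eps:         # epsilon-greedy exploration
            a = np.random.randint(n_actions)
        else:
            a = int(np.argmax(Q[s]))
        x2, r, done = env.step(a)
        s2 = som.train_step(x2)            # quantize next reading
        # standard Q-learning backup on the discrete state indices
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2
    return Q
```

Because the SOM keeps adapting its codebook to the sensor readings the robot actually encounters, the discretization concentrates states where readings are dense, which is the adaptivity and flexibility the abstract attributes to self-organization; `Q` would be initialized as `np.zeros((n_nodes, n_actions))`.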
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2002, No. 5, pp. 558-560 (3 pages).
Keywords: reinforcement learning; self-organization quantization; neural networks; intelligent robot
Related Literature

References (4)

  • 1 Thrun Sebastian, Mitchell Tom M. Lifelong robot learning[J]. Robotics and Autonomous Systems, 1995, 15: 25-46.
  • 2 Ben J. A. Krose, Joris W. M. van Dam. Adaptive state space quantisation for reinforcement learning of collision free navigation[C]. 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems, Raleigh, NC, July 7-10, 1992: 1327-1332.
  • 3 Watkins C J C H, Dayan Peter. Q-learning[J]. Machine Learning, 1992, 8: 279-292.
  • 4 Yan Pingfan (阎平凡). Reinforcement learning: principles, algorithms and applications in intelligent control[J]. Information and Control, 1996, 25(1): 28-34. (Cited by: 30)

Secondary References (6)

  • 1 Leslie Pack Kaelbling. Associative reinforcement learning: Functions in k-DNF[J]. Machine Learning, 1994, 15(3): 279-298.
  • 2 Leslie Pack Kaelbling. Associative reinforcement learning: A generate and test algorithm[J]. Machine Learning, 1994, 15(3): 299-319.
  • 3 Leslie Pack Kaelbling. Associative reinforcement learning: Functions in k-DNF[J]. Machine Learning, 1994, 15(3): 279-298.
  • 4 Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning[J]. Machine Learning, 1992, 8(3-4): 229-256.
  • 5 Christopher J. C. H. Watkins, Peter Dayan. Technical note: Q-learning[J]. Machine Learning, 1992, 8(3-4): 279-292.
  • 6 Richard S. Sutton. Learning to predict by the methods of temporal differences[J]. Machine Learning, 1988, 3(1): 9-44.

Co-citing Literature (29)

Co-cited Literature (15)

  • 1 Lin Lianming, Wang Hao, Wang Yixiong. Sarsa reinforcement learning algorithm based on neural networks[J]. Computer Technology and Development, 2006, 16(1): 30-32. (Cited by: 4)
  • 2 Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. MA: The MIT Press, 1998.
  • 3 Kaelbling L P, Littman M L, Moore A W. Reinforcement learning: A survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
  • 4 Sutton R S. Learning to predict by the methods of temporal differences[J]. Machine Learning, 1988, 3(1): 9-44.
  • 5 Watkins C J C H, Dayan P. Q-learning[J]. Machine Learning, 1992, 8: 279-292.
  • 6 Rummery G A, Niranjan M. On-line Q-learning using connectionist systems[R]. Cambridge University Engineering Department, 1994.
  • 7 Singh S P, Sutton R S. Reinforcement learning with replacing eligibility traces[J]. Machine Learning, 1996, 22: 123-158.
  • 8 Peng J, Williams R. Incremental multi-step Q-learning[J]. Machine Learning, 1996, 22(1-3): 283-290.
  • 9 Andrew James Smith. Applications of the self-organising map to reinforcement learning[J]. Neural Networks, 2002, 15(8-9): 1107-1124.
  • 10 Kohonen T. Self-Organizing Maps[M]. Springer Series in Information Sciences. New York, USA: Springer, 2001.

Citing Articles (2)

Secondary Citing Articles (3)
