Foremost-Policy Reinforcement Learning Based ART2 Neural Network (基于最先策略增强学习的ART2神经网络)  Cited by: 3
Abstract  This paper proposes FPRL-ART2, a foremost-policy reinforcement learning based ART2 neural network, and presents its learning algorithm. To support online (real-time) learning, FPRL-ART2 selects, from the mapping between states and action values, the first action that receives a reward, instead of the action with the optimal action value as in 1-step Q-learning. The ART2 neural network stores the classified patterns, and reinforcement learning strengthens or weakens their weights. FPRL-ART2 is applied to collision avoidance for a mobile robot. Simulation experiments indicate that it reduces the number of collisions between the robot and obstacles and avoids collisions effectively.
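Authors  Fan Jian (樊建), Wu Gengfeng (吴耿锋)
Source  Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 3, pp. 428-432 (5 pages)
Funding  Shanghai Science and Technology Development Fund (No. 015115042); Shanghai Municipal Education Commission Key Discipline Construction Project, Phase 4 (No. B682)
Keywords  Reinforcement Learning, ART2 Neural Network, Foremost-Policy, Collision Avoidance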
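
The selection rule described in the abstract can be contrasted with greedy value-based selection in a minimal sketch. This record does not give the paper's actual algorithm, so everything here is an assumption made for illustration: the function names (`greedy_action`, `foremost_action`, `adjust_weight`), the reading of "rewarded" as a reward greater than zero, and the additive weight update are all hypothetical and are not the authors' implementation.

```python
from typing import Callable, Optional, Sequence


def greedy_action(action_values: Sequence[float]) -> int:
    """Baseline for comparison: pick the action with the highest stored value."""
    return max(range(len(action_values)), key=lambda a: action_values[a])


def foremost_action(
    candidates: Sequence[int], reward_of: Callable[[int], float]
) -> Optional[int]:
    """Return the first candidate whose reward is positive, or None if none is.

    Assumption: "rewarded" means reward > 0; the record does not define it.
    """
    for action in candidates:
        if reward_of(action) > 0:
            return action
    return None


def adjust_weight(weight: float, reward: float, rate: float = 0.1) -> float:
    """Strengthen a stored weight on positive reward, weaken it on negative.

    Assumption: a simple additive update; the paper's rule is not given here.
    """
    return weight + rate * reward


if __name__ == "__main__":
    values = [0.1, 0.4, 0.9]             # hypothetical stored action values
    rewards = {0: -1.0, 1: 1.0, 2: 1.0}  # hypothetical reward per action
    print(greedy_action(values))                            # highest-valued action
    print(foremost_action([0, 1, 2], rewards.__getitem__))  # first rewarded action
```

In this toy setup the two rules disagree: the greedy rule has to compare every stored value, while the foremost rule stops at the first action that earns a reward, which is the property the abstract ties to online learning.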

References (11)

  • 1. Carpenter G A, Grossberg S. ART2: Stable Self-Organization of Category Recognition Codes for Analog Input Patterns. Applied Optics, 1987, 26(23): 4919-4930
  • 2. Liu X H, Yu Z Z, Duan J, et al. Face Recognition Using Adaptive Resonance Theory. In: Proc of the International Conference on Machine Learning and Cybernetics. Xi'an, China, 2003, V: 3167-3171
  • 3. Fan J, Wu G F, et al. Reinforcement Learning and ART2 Neural Network Based Collision Avoidance System of Mobile Robot. In: Yin F L, Wang J, Guo C G, eds. Lecture Notes in Computer Science. 2004, 3174: 35-40
  • 4. Li M (黎明), Yan C H (严超华), Liu G H (刘高航). ART2 Neural Network with a Stricter Vigilance Test Criterion [J]. Journal of Image and Graphics, Series A (中国图象图形学报 A辑), 2001, 6(1): 81-85. Cited by: 4
  • 5. Whitehead S D, Sutton R S, Ballard D H. Advances in Reinforcement Learning and Their Implications for Intelligent Control. In: Proc of the 5th IEEE International Symposium on Intelligent Control. Philadelphia, USA, 1990, Ⅱ: 1289-1297
  • 6. Suwimonteerabuth D, Chongstitvatana P. Online Robot Learning by Reward and Punishment for a Mobile Robot. In: Proc of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Lausanne, Switzerland, 2002, Ⅰ: 921-926
  • 7. Fujimori A, Tani S. A Navigation of Mobile Robot with Collision Avoidance for Moving Obstacles. In: Proc of the IEEE International Conference on Industrial Technology. Bangkok, Thailand, 2002, Ⅰ: 1-6
  • 8. Grossberg S. Adaptive Pattern Classification and Universal Recoding, Ⅰ: Parallel Development and Coding of Neural Feature Detectors. Biological Cybernetics, 1976, 23(3): 121-134
  • 9. Grossberg S. Adaptive Pattern Classification and Universal Recoding, Ⅱ: Feedback, Expectation, Olfaction, Illusions. Biological Cybernetics, 1976, 23(4): 187-202
  • 10. Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge, USA: MIT Press, 1998

Secondary references (5)

  • 1. Carpenter G A, Grossberg S. ART2: Stable self-organization of category recognition codes for analog input patterns. Applied Optics, 1987, 26: 4919-4930.
  • 2. Baxter R. Error propagation and supervised learning in adaptive resonance networks. In: International Joint Conference on Neural Networks Ⅱ, Piscataway, NJ, USA, 1991: 423-429.
  • 3. Li F, Zhan J. Fuzzy adapting vigilance parameter of ART-Ⅱ neural nets. In: IEEE International Conference on Neural Networks, Orlando, FL, USA, 1994, 3: 1680-1685.
  • 4. Huang J, Georgiopoulos M, Heileman G L. Fuzzy ART properties. Neural Networks, 1995, 3: 202-213.
  • 5. Ming L, Paul S W. Personal identification by palm prints recognition. In: The 10th International FLAIRS Conference. Daytona Beach, Florida, USA, 1997: 1211-1218.

Co-cited literature (7)

  • 1. Li W J (李武军), Wang C J (王崇骏), Zhang W (张炜), Chen S F (陈世福). A Survey of Face Recognition [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2006, 19(1): 58-66. Cited by: 106
  • 2. Zhao W, Chellappa R, Rosenfeld A, et al. Face Recognition: A Literature Survey [J]. ACM Computing Surveys, 2003, 35(4): 399-458.
  • 3. Koji Kotani, Chen Qiu, Tadahiro Ohmi. Face Recognition Using Vector Quantization Histogram Method [J]. IEEE ICIP, 2002(9): 105-108.
  • 4. Phillips P J, Grother P, Micheals R J, et al. Face Recognition Vendor Test 2002: Evaluation Report. 2003. http://www.frvt.org/FRVT2002/documents.htm.
  • 5. Chen Lifen. A new LDA-based face recognition system which can solve the small sample size problem [J]. Pattern Recognition, 2000, 33(10): 1713-1726.
  • 6. AT&T Laboratory Cambridge. The ORL Database of Faces [DB/OL]. http://www.uk.research.att.com/facedatabase.html.
  • 7. Phil Picton. Neural Networks [M]. 2nd ed. Hampshire: Palgrave, 2000: 115-137.
