
Study of the Main Reinforcement Learning Algorithms

Cited by: 1
Abstract: This paper first introduces the reinforcement learning model. It then presents seven main reinforcement learning algorithms, including dynamic programming, the Monte Carlo method, temporal-difference learning, and Q-learning, and discusses the differences and relationships among them. Finally, it proposes directions for future research.
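Author: Li Rui (李瑞)
Source: Journal of Western Chongqing University (Natural Science Edition) (《渝西学院学报(自然科学版)》), 2004, No. 3, pp. 22-25 (4 pages)
Keywords: reinforcement learning; dynamic programming; Monte Carlo method; temporal-difference algorithm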

References (5)

  • [1] Kaelbling L P, Littman M L, Moore A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
  • [2] Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. Cambridge, MA: MIT Press, 1998.
  • [3] Bellman R E. Dynamic Programming[M]. Princeton: Princeton University Press, 1957.
  • [4] Sutton R S. Learning to predict by the methods of temporal differences[J]. Machine Learning, 1988, 3: 9-44.
  • [5] Sutton R S. Temporal credit assignment in reinforcement learning[D]. Amherst: University of Massachusetts, 1984.

