
Review of Research on Reinforcement Learning in Few-Shot Scenes
Abstract: Based on the background of the few-shot problem, this paper divides few-shot scenes into two types: the first type pursues more specialized performance, while the second pursues more general performance. In the process of knowledge generalization, different scenes show a clear preference for the kind of knowledge carrier they require. Accordingly, few-shot learning methods are classified from the perspective of the knowledge carrier into methods that use procedural knowledge and methods that use declarative knowledge, and few-shot reinforcement learning algorithms are then discussed under this classification. Finally, possible directions for future development are proposed from both theoretical and applied perspectives, in the hope of providing a reference for subsequent research.
Authors: Wang Zhechao; Fu Qiming; Chen Jianping; Hu Fuyuan; Lu You; Wu Hongjie (School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; Jiangsu Provincial Key Laboratory of Building Intelligence and Energy Saving, Suzhou University of Science and Technology, Suzhou 215009, China; Suzhou Key Laboratory of Mobile Networking and Applied Technologies, Suzhou University of Science and Technology, Suzhou 215009, China)
Source: Journal of Nanjing Normal University (Engineering and Technology Edition), CAS, 2022, No. 1, pp. 86-92 (7 pages)
Funding: National Key Research and Development Program of China (2020YFC2006602); National Natural Science Foundation of China (62072324, 61876217, 61876121, 61772357, 62073231, 61902272); Key Research and Development Program of Jiangsu Province (BE2017663).
Keywords: reinforcement learning; few-shot learning; meta-learning; transfer learning; lifelong learning; knowledge generalization
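
The abstract's classification separates methods that transfer procedural knowledge (for example, a meta-learned policy initialization or update rule) from methods that transfer declarative knowledge (for example, stored models, facts or memories). As a rough illustration of the procedural side only, the sketch below meta-learns a policy initialization with a Reptile-style outer loop over a family of toy bandit tasks. The bandit setting, the task family, and all hyper-parameters are illustrative assumptions made here for the sketch, not details taken from the paper.

# Minimal sketch (assumptions, not the paper's method): procedural-knowledge
# transfer in few-shot RL via a Reptile-style meta-learned initialization
# for a softmax bandit policy.
import numpy as np

rng = np.random.default_rng(0)
K = 5  # number of arms per bandit task
base_means = rng.normal(0.0, 1.0, size=K)  # shared structure across the task family

def sample_task():
    # Each task is a K-armed Gaussian bandit: shared base means plus a small perturbation.
    return base_means + rng.normal(0.0, 0.3, size=K)

def softmax(logits):
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def adapt(theta, means, shots=10, lr=0.5):
    # Few-shot inner loop: REINFORCE updates from only a handful of pulls.
    theta = theta.copy()
    for _ in range(shots):
        p = softmax(theta)
        a = rng.choice(K, p=p)
        r = rng.normal(means[a], 0.1)
        grad = -p
        grad[a] += 1.0            # gradient of log pi(a) w.r.t. the logits
        theta += lr * r * grad    # policy-gradient step
    return theta

# Outer loop: Reptile-style update of the initialization, which is the
# "procedural knowledge" carried across tasks.
theta0 = np.zeros(K)
for _ in range(2000):
    task = sample_task()
    theta0 += 0.05 * (adapt(theta0, task) - theta0)

# After meta-training, a new task is adapted from only a few shots.
test_task = sample_task()
theta_new = adapt(theta0, test_task, shots=10)
print("best arm:", int(test_task.argmax()), "adapted policy:", softmax(theta_new).round(2))

In this sketch the knowledge that generalizes is purely procedural: nothing about individual tasks is stored, only an initialization from which a few gradient steps suffice. A declarative-knowledge counterpart would instead carry explicit task data, such as a learned model or an episodic memory, across tasks.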

