Abstract
Reinforcement learning is one of the three main machine learning paradigms, alongside supervised learning and unsupervised learning, and is widely used in artificial intelligence. In reinforcement learning, an agent interacts directly with an environment and learns an optimal policy by maximizing cumulative reward. This paper reviews the development of the field and introduces classical and deep-network-based reinforcement learning algorithms and models, including value-function-based methods, policy gradient methods, actor-critic algorithms, and deep Q-networks and their improved variants. It closes with a discussion of current research challenges and future prospects for reinforcement learning.
Authors
甘宁
李玉龙
汪永寿
乔琛
GAN Ning; LI Yulong; WANG Yongshou; QIAO Chen (School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China; China Tower Corporation Limited Qinghai Branch, Xining, Qinghai 810008, China)
Source
《数学建模及其应用》
2024, No. 3, pp. 1-14 (14 pages)
Mathematical Modeling and Its Applications
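Funding
Major Program of the National Natural Science Foundation of China (12090021)
National Natural Science Foundation of China (12271429, 12226007)
Natural Science Basic Research Program of Shaanxi Province (2022JM-005)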
Keywords
reinforcement learning
machine learning
artificial intelligence