Abstract
Reinforcement learning is an important branch of machine learning and a major direction of development in artificial intelligence. This article discusses the basic framework of reinforcement learning built on the Markov decision process (MDP), analyzes the basic reinforcement learning model, states the objective of reinforcement learning, and breaks down the theoretical derivations involved. From a theoretical perspective, it examines the actor-critic method that underlies deep reinforcement learning and discusses the essence of the deterministic policy gradient (DPG) method. It then analyzes the twin delayed deep deterministic policy gradient (TD3) method, which has performed well in recent years, and surveys current research directions and representative methods in reinforcement learning. Finally, the article turns to applications: it examines reinforcement learning in terms of its current application areas, the problems it can handle, and the challenges it faces, assesses the current state of its application, and offers predictions about its future development.
Source
Computer Science and Application (《计算机科学与应用》), 2022, Issue 3, pp. 554-564 (11 pages)