Abstract
This paper first introduces the basic principles of reinforcement learning (RL) and analyzes the theoretical foundations of the Markov decision process (MDP) and the semi-Markov decision process (SMDP), together with their application to RL. It then explains the ideas of hierarchy and abstraction in hierarchical reinforcement learning (HRL), analyzes the HAM, Options, and MaxQ methods, and compares them from the perspectives of hierarchy and abstraction. Finally, it points out directions for future research on HRL.
Source
Journal of Guangdong University of Petrochemical Technology (《广东石油化工学院学报》)
2013, Issue 4, pp. 30-33, 52 (5 pages)
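Funding
National Natural Science Foundation of China (61272382)
Natural Science Foundation of Guangdong Province (8152500002000003, S2012010009963)
Guangdong Higher Education Science and Technology Innovation Project (2012KJCX0077)
Guangdong Universities Engineering Center for Petrochemical Equipment Fault Diagnosis and Informatization Control Project (512009)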
Keywords
hierarchical reinforcement learning (HRL)
semi-Markov decision process (SMDP)
abstraction
convergence
learning