Abstract
To solve reinforcement learning problems with large state spaces, a SARSA(λ) reinforcement learning algorithm based on state clustering is proposed. The basic idea is to use prior knowledge, or a controller trained beforehand, to partition the state space into clusters, and then to carry out SARSA(λ) learning over the cluster space. If the states are clustered appropriately, the algorithm can obtain a relatively good approximate value function.
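The abstract gives only the outline of the method, so the following is a minimal sketch of that outline and not the paper's implementation. It assumes a hypothetical toy chain environment (`step`), a hypothetical fixed clustering rule standing in for prior knowledge (`cluster_of`, which groups adjacent states), accumulating eligibility traces, and illustrative hyperparameter values. None of these details come from the source.

```python
import random


def cluster_of(state, cluster_size=5):
    # Hypothetical stand-in for prior knowledge: adjacent chain states share a cluster.
    return state // cluster_size


def step(state, action, n_states):
    # Hypothetical toy chain: action 0 moves left, action 1 moves right.
    # Reward 1.0 is given only on reaching the right-most state, which ends the episode.
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done


def epsilon_greedy(q, cluster, n_actions, epsilon, rng):
    # Explore with probability epsilon, otherwise pick a best action (ties broken at random).
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    best = max(q[cluster])
    return rng.choice([a for a in range(n_actions) if q[cluster][a] == best])


def clustered_sarsa_lambda(n_states=20, cluster_size=5, episodes=200,
                           alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1,
                           max_steps=1000, seed=0):
    rng = random.Random(seed)
    n_actions = 2
    n_clusters = (n_states + cluster_size - 1) // cluster_size
    # The value table is indexed by cluster, not by raw state.
    q = [[0.0] * n_actions for _ in range(n_clusters)]
    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_clusters)]
        s = 0
        c = cluster_of(s, cluster_size)
        a = epsilon_greedy(q, c, n_actions, epsilon, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, a, n_states)
            c2 = cluster_of(s2, cluster_size)
            a2 = epsilon_greedy(q, c2, n_actions, epsilon, rng)
            # On-policy SARSA target, computed over clusters.
            target = r if done else r + gamma * q[c2][a2]
            delta = target - q[c][a]
            e[c][a] += 1.0  # accumulating trace
            for ci in range(n_clusters):
                for ai in range(n_actions):
                    q[ci][ai] += alpha * delta * e[ci][ai]
                    e[ci][ai] *= gamma * lam
            if done:
                break
            s, c, a = s2, c2, a2
    return q


if __name__ == "__main__":
    for cluster_index, row in enumerate(clustered_sarsa_lambda()):
        print(cluster_index, row)
```

In this sketch the table has one row per cluster (4 rows for 20 states with clusters of 5), which is where the saving over a per-state table comes from. The quality of the resulting approximation depends on how well the chosen clustering groups states with similar values.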
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2003, Issue 5, pp. 37-38, 98 (3 pages in total)
Computer Engineering