Abstract
A new algorithm for automatically generating Options in hierarchical reinforcement learning (HRL) is presented. The algorithm takes as input the state space explored by the Agent during the initial learning phase and clusters those states with an improved Ant Colony Clustering Algorithm (ACCA). On each of the resulting state subsets, a set of internal (intra-option) policies is learned through experience replay, and the Options are generated from them. Simulation experiments demonstrate the validity of the algorithm.
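The abstract gives no implementation details, so the following is only a minimal sketch of the experience-replay step under stated assumptions. It assumes the clustering step has already produced a state-to-cluster mapping (the hypothetical `cluster_of`; the ACCA itself is not reproduced here), that an Option terminates when the Agent leaves its cluster, and that leaving earns a pseudo-reward (the hypothetical `exit_bonus`). None of these names or choices come from the paper.

```python
from collections import defaultdict


def learn_option_policies(transitions, cluster_of, n_sweeps=50,
                          alpha=0.5, gamma=0.9, exit_bonus=1.0):
    """Replay stored transitions to learn one greedy policy per cluster.

    transitions: list of (state, action, reward, next_state) tuples
                 recorded during the initial exploration phase.
    cluster_of:  dict mapping each state to a cluster id (assumed to be
                 the output of a separate clustering step).
    """
    q = defaultdict(float)  # keyed by (cluster, state, action)
    actions = sorted({a for _, a, _, _ in transitions})

    for _ in range(n_sweeps):
        for s, a, r, s_next in transitions:
            c = cluster_of[s]
            if cluster_of[s_next] != c:
                # Assumption: leaving the cluster ends the Option and
                # earns a pseudo-reward, so there is no bootstrapping.
                target = r + exit_bonus
            else:
                target = r + gamma * max(q[(c, s_next, b)] for b in actions)
            q[(c, s, a)] += alpha * (target - q[(c, s, a)])

    # Each Option: the cluster as initiation set plus a greedy policy.
    options = {}
    for s, c in cluster_of.items():
        opt = options.setdefault(c, {"initiation_set": set(), "policy": {}})
        opt["initiation_set"].add(s)
        opt["policy"][s] = max(actions, key=lambda b: q[(c, s, b)])
    return options
```

On a four-state chain split into two clusters, each learned policy should steer the Agent toward the boundary of its own cluster, which is one simple way to read "Options generated from clustered state subsets".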
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2008, Issue 19, pp. 39-40, 49 (3 pages in total)
Computer Engineering and Applications
Keywords
hierarchical reinforcement learning
Option
Ant Colony Clustering Algorithm (ACCA)
experience replay