Abstract
This paper investigates a hierarchical reinforcement learning algorithm for Markov decision processes with an average-cost criterion. As an example, the algorithm is applied to an open re-entrant manufacturing system composed of two machines. Computer simulation results show that the algorithm outperforms some well-known heuristic scheduling policies.
Source
《系统工程理论与实践》
EI
CSCD
PKU Core (Peking University Core Journals)
2002, No. 5, pp. 76-80, 102 (6 pages)
Systems Engineering-Theory & Practice
Funding
Shanghai Natural Science Foundation (01ZD14066)