Abstract
This paper studies the expected average criterion for semi-Markov decision processes determined by a denumerable state space, an arbitrary action space, and a semi-Markov decision matrix. Under certain conditions on the semi-Markov decision matrix and the reward function, we prove the existence of ε(≥0)-strongly optimal stationary policies by using the stability theorem from probability theory.
Source
《长沙铁道学院学报》
CSCD
1995, No. 3, pp. 71-78 (8 pages)
Journal of Changsha Railway University
Funding
Natural Science Foundation of Hunan Province
Keywords
Markov Decision Programming
Semi-Markov Decision Matrix
Expected Average Criterion