Abstract
This paper studies optimal policies with minimal variance under the discounted criterion for time-nonhomogeneous, discrete-time Markovian decision programming. To this end, we first discuss optimal policies for the model with nonnegative losses. Then, under the condition that the absolute mean of the rewards is relatively bounded, or that the model has nonnegative losses, we prove that the problem of finding optimal policies with minimal variance is equivalent to a discounted Markovian decision programming problem with nonnegative losses. Finally, we give necessary and sufficient conditions for the existence of optimal policies with minimal variance, together with a finite-stage approximation for finding such policies.
Keywords
Time-nonhomogeneous discounting
Markovian decision programming
Optimal policies
Markovian Decision Programming with nonnegative losses
Optimal policies with minimal variance