Abstract
This paper investigates a discounted vector-valued Markovian decision model with unbounded rewards. Optimality is defined with respect to a partial-order criterion determined by the convex cone generated by a set of linearly independent vectors. The existence of an optimal policy is proved, and the intrinsic structure of some optimal policies is discussed. Necessary and sufficient conditions for the existence of a strongly optimal policy are given. It is also shown that convex combinations and self-combinations of optimal policies are again optimal, and that stationary policies are dominant within the class of general policies.
Source
《云南大学学报(自然科学版)》
CAS
CSCD
1993, No. 3, pp. 200-207 (8 pages)
Journal of Yunnan University (Natural Sciences Edition)
Keywords
Unbounded reward vector
Markov decision programming
Discounted Markovian decision programming, optimal policies, unbounded vector-valued reward