Optimal models with maximizing probability of first achieving target value in the preceding stages (Cited by: 1)
Authors: 林元烈 (Lin Yuanlie), 伍从斌 (Wu Congbin), 康波大 (Kang Boda). Science China Mathematics (SCIE), 2003, Issue 3, pp. 396-414 (19 pages).
Decision makers often need a performance guarantee that holds with sufficiently high probability. Such problems can be modelled using a discrete-time Markov decision process (MDP) with a probability criterion for first achieving a target value. The objective is to find a policy that maximizes the probability of the total discounted reward exceeding a target value in the preceding stages. We show that our formulation cannot be described by former models with standard criteria. We provide the properties of the objective functions, optimal value functions and optimal policies. An algorithm for computing the optimal policies for the finite horizon case is given. In this stochastic stopping model, we prove that there exists an optimal deterministic and stationary policy and that the optimality equation has a unique solution. Using perturbation analysis, we approximate general models and prove the existence of an ε-optimal policy for finite state space. We give an example for the reliability of satellite systems using the above theory. Finally, we extend these results to more general cases.
Keywords: probability criterion; Markov decision processes; minimizing risk; first achieving target value.
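
The finite-horizon problem described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which is not reproduced in this record. It is a generic backward recursion on a state augmented with the remaining target, under these assumptions: the target counts as achieved as soon as the accumulated discounted reward is at least the target, rewards may depend on the transition, and the discount factor is positive. The function name `max_prob_reach_target`, the `transitions` layout, and the toy model `EXAMPLE_MDP` are all hypothetical.

```python
from functools import lru_cache

# Hypothetical toy model: transitions[state][action] is a list of
# (probability, next_state, reward) outcomes.
EXAMPLE_MDP = {
    0: {
        "safe": [(1.0, 0, 1.0)],                    # always earns 1
        "risky": [(0.5, 0, 3.0), (0.5, 0, 0.0)],    # earns 3 or 0
    }
}


def max_prob_reach_target(transitions, beta, horizon, start_state, target):
    """Maximal probability that the discounted reward sum reaches `target`
    at some stage within `horizon` steps (assumes beta > 0)."""

    @lru_cache(maxsize=None)
    def value(state, remaining, steps_left):
        # Assumption: success is locked in once the remaining target is <= 0.
        if remaining <= 0:
            return 1.0
        if steps_left == 0:
            return 0.0
        best = 0.0
        for outcomes in transitions[state].values():
            # After earning `reward`, the rest of the target is rescaled by
            # 1/beta so that later rewards can be treated as undiscounted.
            p_success = sum(
                prob * value(nxt, (remaining - reward) / beta, steps_left - 1)
                for prob, nxt, reward in outcomes
            )
            best = max(best, p_success)
        return best

    return value(start_state, float(target), horizon)


if __name__ == "__main__":
    print(max_prob_reach_target(EXAMPLE_MDP, 1.0, 2, 0, 3.0))
```

In the toy model with no discounting, two steps and a target of 3, the safe action alone can never reach the target, so the recursion prefers the risky action. Memoizing on the pair (state, remaining target) is what distinguishes this criterion from an expected-reward recursion, where the state alone would suffice.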