Markov decision processes associated with two threshold probability criteria
Authors: Masahiko SAKAGUCHI, Yoshio OHTSUBO. Control Theory and Applications (English Edition) [EI, CSCD], 2013, No. 4, pp. 548-557 (10 pages)
This paper deals with Markov decision processes with a target set for nonpositive rewards. Two types of threshold probability criteria are discussed. The first criterion is the probability that the total reward is not greater than a given initial threshold value, and the second is the probability that the total reward is less than it. Our first (resp. second) optimization problem is to minimize the first (resp. second) threshold probability. In these problems the threshold value is interpreted as a permissible level of the total reward for reaching a goal (the target set); that is, we would like to reach this set with the total reward above the level, if possible. For both problems, we 1) show that the optimal threshold probability is the unique solution to an optimality equation, 2) show that there exists an optimal deterministic stationary policy, and 3) give a value iteration and a policy space iteration. In addition, we prove that the first (resp. second) optimal threshold probability is a monotone increasing and right (resp. left) continuous function of the initial threshold value, and we propose a method to obtain an optimal policy and the optimal threshold probability in the first problem by using those of the second problem.
Keywords: Markov decision process; minimizing risk model; threshold probability; policy space iteration
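
To make the two criteria in the abstract concrete, the following is a minimal sketch on a toy model and is not the authors' algorithm. It assumes integer rewards that are strictly negative outside the target set, so a memoized recursion over the pair (state, remaining threshold) terminates; the paper's value iteration and policy space iteration are not reproduced here. The names `MODEL`, `TARGET`, and `make_solver`, and the states and actions themselves, are hypothetical.

```python
from functools import lru_cache

TARGET = "goal"  # hypothetical absorbing target state, reward 0 once reached

# Hypothetical toy model: MODEL[state][action] = (reward, {next_state: probability}).
# Assumption of this sketch: rewards outside the target set are strictly
# negative integers, which guarantees that the recursion below terminates.
MODEL = {
    "s": {
        "slow": (-1, {TARGET: 0.5, "s": 0.5}),
        "fast": (-2, {TARGET: 1.0}),
    },
}


def make_solver(strict):
    """Build solve(state, r) -> (minimal threshold probability, minimizing action).

    strict=False: first criterion,  P(total reward <= r).
    strict=True:  second criterion, P(total reward <  r).
    """

    def below(total, r):
        return total < r if strict else total <= r

    @lru_cache(maxsize=None)
    def solve(state, r):
        if state == TARGET:
            # No further reward is collected after reaching the target set.
            return (1.0 if below(0, r) else 0.0), None
        if below(0, r):
            # The total reward is nonpositive, so the event is certain.
            return 1.0, None
        best_p, best_a = None, None
        for action, (reward, transitions) in MODEL[state].items():
            # After collecting `reward`, the remaining threshold is r - reward.
            p = sum(q * solve(nxt, r - reward)[0] for nxt, q in transitions.items())
            if best_p is None or p < best_p:
                best_p, best_a = p, action
        return best_p, best_a

    return solve


first = make_solver(strict=False)
second = make_solver(strict=True)
print(first("s", -2), first("s", -3))
print(second("s", -1), second("s", -2))
```

In this toy model the minimizing action changes with the remaining threshold, and because the rewards are integers the second criterion at threshold r coincides with the first at r - 1. That coincidence is a property of the toy model only, not a statement of the paper's general result.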