Funding: Supported by the Natural Science Foundation of China (No. 60874004, 60736028) and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2010).
Abstract: This paper considers a first passage model for discounted semi-Markov decision processes with denumerable states and nonnegative costs. The criterion to be optimized is the expected discounted cost incurred during a first passage time to a given target set. We first construct a semi-Markov decision process under a given semi-Markov decision kernel and a policy. Then, using a minimum nonnegative solution approach, we prove that the value function satisfies the optimality equation and that an optimal (or ε-optimal) stationary policy exists under suitable conditions. Further, we give some properties of optimal policies. In addition, a value iteration algorithm for computing the value function and optimal policies is developed, and an example is given. Finally, it is shown that our model extends the first passage models for both discrete-time and continuous-time Markov decision processes.
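The value iteration idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a finite state space, and it assumes the model has already been reduced to a one-step expected discounted cost `cost[(i, a)]` and discounted transition weights `beta[(i, a)][j]` (the Laplace transform of the semi-Markov kernel at the discount rate). The function name, the dictionaries, and all numbers are hypothetical. Starting from the zero function, the iterates increase toward the minimum nonnegative solution of the optimality equation, with the value fixed at 0 on the target set.

```python
def q_value(i, a, v, cost, beta):
    # One-step discounted cost plus discounted continuation value.
    return cost[(i, a)] + sum(w * v[j] for j, w in beta[(i, a)].items())


def first_passage_value_iteration(states, actions, target, cost, beta,
                                  tol=1e-10, max_iter=10_000):
    # Start from v = 0 so the iterates increase monotonically
    # (costs are nonnegative by assumption).
    v = {i: 0.0 for i in states}
    for _ in range(max_iter):
        new_v = {}
        for i in states:
            if i in target:
                new_v[i] = 0.0  # no cost is incurred after reaching the target
            else:
                new_v[i] = min(q_value(i, a, v, cost, beta) for a in actions[i])
        diff = max(abs(new_v[i] - v[i]) for i in states)
        v = new_v
        if diff < tol:
            break
    # Greedy stationary policy with respect to the final iterate.
    policy = {
        i: min(actions[i], key=lambda a: q_value(i, a, v, cost, beta))
        for i in states if i not in target
    }
    return v, policy


# Hypothetical three-state example; state 2 is the target set.
states = [0, 1, 2]
target = {2}
actions = {0: ["slow", "fast"], 1: ["go"]}
cost = {(0, "slow"): 1.0, (0, "fast"): 3.0, (1, "go"): 1.0}
beta = {(0, "slow"): {1: 0.9}, (0, "fast"): {2: 0.9}, (1, "go"): {2: 0.9}}

v, policy = first_passage_value_iteration(states, actions, target, cost, beta)
print(v, policy)
```

In this toy instance the cheaper two-step route through state 1 beats the expensive direct jump to the target, so the greedy policy picks "slow" in state 0.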
Funding: Supported by the National Natural Science Foundation of China (No. 60574002).
Abstract: In this paper, we obtain the transition probability of the jump chain of a semi-Markov process, the distribution of the sojourn time, and the one-dimensional distribution of the semi-Markov process. Furthermore, the semi-Markov process X(t, ω) is constructed from the semi-Markov matrix, and it is proved that the two definitions of a semi-Markov process are equivalent.
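As an illustration of building a path X(t, ω) from a semi-Markov matrix, the following sketch simulates one trajectory. It is not the paper's construction: it assumes the kernel factors as Q_ij(t) = p_ij F_ij(t), with `p` the jump-chain transition probabilities and `sample_sojourn` drawing from F_ij. The exponential sojourn law, the function names, and the numbers are all hypothetical choices made for the example.

```python
import bisect
import random


def simulate_semi_markov(p, sample_sojourn, start, horizon, rng):
    # Returns jump times and the states entered at those times, up to `horizon`.
    jump_times = [0.0]
    visited = [start]
    t, i = 0.0, start
    while True:
        candidates = list(p[i].keys())
        weights = [p[i][k] for k in candidates]
        j = rng.choices(candidates, weights=weights)[0]  # next state of the jump chain
        t += sample_sojourn(i, j, rng)                   # sojourn time in i given next state j
        if t > horizon:
            break
        jump_times.append(t)
        visited.append(j)
        i = j
    return jump_times, visited


def state_at(t, jump_times, visited):
    # X(t): the state entered at the last jump time not exceeding t.
    return visited[bisect.bisect_right(jump_times, t) - 1]


# Hypothetical jump chain on {0, 1, 2}.
p = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {0: 1.0}}


def sample_sojourn(i, j, rng):
    # Assumed sojourn law: exponential with rate depending on the current state only.
    return rng.expovariate(1.0 + i)


rng = random.Random(0)
jump_times, visited = simulate_semi_markov(p, sample_sojourn, 0, 10.0, rng)
print(state_at(5.0, jump_times, visited))
```

The pair (`jump_times`, `visited`) is the embedded jump chain with its sojourn times, and `state_at` recovers the right-continuous, piecewise-constant path from it.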