3 articles found
1. Convergence of Markov decision processes with constraints and state-action dependent discount factors (cited by: 2)
Authors: Xiao Wu, Xianping Guo. Science China Mathematics (SCIE, CSCD), 2020, No. 1, pp. 167-182 (16 pages)
This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically, and give the corresponding error bound of the approximation.
Keywords: discrete-time Markov decision processes; state-action dependent discount factors; unbounded costs; convergence
2. First passage Markov decision processes with constraints and varying discount factors (cited by: 2)
Authors: Xiao WU, Xiaolong ZOU, Xianping GUO. Frontiers of Mathematics in China (SCIE, CSCD), 2015, No. 4, pp. 1005-1023 (19 pages)
This paper focuses on the constrained optimality problem (COP) of first passage discrete-time Markov decision processes (DTMDPs) in denumerable state and compact Borel action spaces with multi-constraints, state-dependent discount factors, and possibly unbounded costs. By means of the properties of a so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear program on the set of occupation measures with some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear program, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, as an example, a controlled queueing system is given to illustrate our results.
Keywords: discrete-time Markov decision process (DTMDP); constrained optimality; varying discount factor; unbounded cost
3. Total reward criteria for unconstrained/constrained continuous-time Markov decision processes
Authors: Xianping GUO, Lanlan ZHANG. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2011, No. 3, pp. 491-505 (15 pages)
This paper studies denumerable continuous-time Markov decision processes with expected total reward criteria. The authors first study the unconstrained model with possibly unbounded transition rates, and give suitable conditions on the controlled system's primitive data under which they show the existence of a solution to the total reward optimality equation and also the existence of an optimal stationary policy. Then, the authors impose a constraint on an expected total cost, and consider the associated constrained model. Based on the results about the unconstrained model and using the Lagrange multipliers approach, the authors prove the existence of constrained-optimal policies under some additional conditions. Finally, the authors apply the results to controlled queueing systems.
Keywords: constrained-optimal policy; continuous-time Markov decision process; optimal policy; total reward criterion; unbounded reward/cost and transition rates