This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically, and give the corresponding error bound of the approximation.
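As a rough illustration of the truncation idea (not the paper's actual model or error bound), a value-iteration sketch for a finite truncation of a birth-death-type controlled queue with a state-action dependent discount factor might look like the following; the arrival rate, service actions, cost function, and discount rule are all hypothetical choices for the sketch.

```python
import numpy as np

# Hypothetical finite truncation of a controlled queue:
# states 0..N-1 are queue lengths, actions are service intensities.
N = 20
actions = [0.4, 0.7]   # hypothetical service intensities
arrival = 0.3          # hypothetical arrival probability per period

def transition(x, a):
    """Row of the transition matrix at state x under action a."""
    p = np.zeros(N)
    up = arrival * (1 - a)    # arrival, no service completion
    down = a * (1 - arrival)  # service completion, no arrival
    p[min(x + 1, N - 1)] += up
    p[max(x - 1, 0)] += down
    p[x] += 1 - up - down
    return p

def cost(x, a):
    return x + 5.0 * a        # holding cost plus effort cost (hypothetical)

def discount(x, a):
    # state-action dependent discount factor, bounded away from 1
    return 0.9 + 0.05 * a     # hypothetical rule with values in (0, 1)

# Value iteration: V(x) = min_a { c(x,a) + alpha(x,a) * sum_y P(y|x,a) V(y) }
V = np.zeros(N)
for _ in range(500):
    Q = np.array([[cost(x, a) + discount(x, a) * (transition(x, a) @ V)
                   for a in actions] for x in range(N)])
    V = Q.min(axis=1)
policy = [actions[i] for i in Q.argmin(axis=1)]
```

Because the per-step discount factors stay below 1, the Bellman operator is a contraction and the iteration converges; enlarging N would, in the spirit of the paper, shrink the truncation error.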
We study Markov decision processes under the average-value-at-risk criterion. The state space and the action space are Borel spaces, the costs may be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. Furthermore, we apply our main results to a cash-balance model.
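For context, the average-value-at-risk (AVaR, also known as CVaR) of a cost X at level tau admits the standard Rockafellar-Uryasev representation AVaR_tau(X) = VaR_tau(X) + E[(X - VaR_tau(X))^+] / (1 - tau). A minimal Monte Carlo sketch of this quantity, using a hypothetical exponential cost distribution rather than anything from the paper:

```python
import numpy as np

def avar(samples, tau):
    """Empirical AVaR at level tau via the Rockafellar-Uryasev formula:
    AVaR_tau(X) = VaR_tau(X) + E[(X - VaR_tau(X))^+] / (1 - tau)."""
    var = np.quantile(samples, tau)  # empirical VaR (tau-quantile)
    return var + np.mean(np.maximum(samples - var, 0.0)) / (1 - tau)

rng = np.random.default_rng(0)
costs = rng.exponential(scale=1.0, size=100_000)  # hypothetical cost samples
a95 = avar(costs, 0.95)
```

For the Exp(1) distribution the memoryless property gives AVaR_0.95 = ln(20) + 1, so the estimate lands near 4; in the paper's dynamic setting the optimization is over policies, not a fixed sample.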
This paper obtains a uniform estimate for the maximum of sums of independent, heavy-tailed random variables with nonnegative random weights, which may be arbitrarily dependent on each other. Applications to ruin probabilities in a discrete-time risk model with dependent stochastic returns are then considered.
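In such discrete-time risk models, the finite-horizon ruin probability is typically the chance that the running maximum of randomly weighted claim sums exceeds the initial surplus u. A simulation sketch of that quantity, with hypothetical Pareto claims and uniform random discount weights (the paper's estimate is analytic, not simulated):

```python
import numpy as np

rng = np.random.default_rng(1)

def ruin_prob(u, n_periods=20, n_sims=100_000):
    """Estimate P( max_n sum_{i<=n} theta_1*...*theta_i * X_i > u )
    by Monte Carlo, for hypothetical claim and weight distributions."""
    # Heavy-tailed claims: Pareto with tail index 2.5 (hypothetical)
    X = rng.pareto(2.5, size=(n_sims, n_periods)) + 1.0
    # Nonnegative random weights from stochastic returns (hypothetical)
    theta = rng.uniform(0.8, 1.0, size=(n_sims, n_periods))
    W = np.cumprod(theta, axis=1)          # products theta_1*...*theta_i
    S = np.cumsum(W * X, axis=1)           # weighted partial sums
    return float(np.mean(S.max(axis=1) > u))
```

As expected, the estimate decreases in the initial surplus u; the paper's contribution is a uniform asymptotic estimate for this probability that tolerates arbitrary dependence among the weights.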
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61374067 and 41271076).
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61673019 and 11931018), the Natural Science Foundation of Guangdong Province (Grant Nos. 2018A030313738 and 2021A1515010057), the Guangdong Province Key Laboratory of Computational Science at Sun Yat-sen University (2020B1212060032), and the IMR and RAE Research Fund, Faculty of Science, HKU.
Funding: this work was supported by the National Natural Science Foundation of China (Grant Nos. 70272001 and 10371117). The first author's work was also supported by the China Postdoctoral Science Foundation (Grant No. 2005037809) and the Youth Science and Technology Foundation of UESTC (Grant No. JX 03038).