5 articles found
1. Convergence of Markov decision processes with constraints and state-action dependent discount factors (cited by: 2)
Authors: Xiao Wu, Xianping Guo. Science China Mathematics (SCIE, CSCD), 2020, No. 1, pp. 167-182 (16 pages)
This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically, and give the corresponding error bound of the approximation.
Keywords: discrete-time Markov decision processes; state-action dependent discount factors; unbounded costs; convergence
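2. Asymptotic optimization of quantized policies for discounted Markov processes on Polish spaces
Authors: Wu Xiao (吴晓), Kong Yinying (孔荫莹), Guo Zhenbin (郭圳滨). Acta Mathematica Scientia, Series A (《数学物理学报(A辑)》; CSCD, Peking University Core), 2022, No. 2, pp. 594-604 (11 pages)
This paper studies the asymptotic optimality of quantized stationary policies for continuous-time Markov decision processes (CTMDPs) with discount factors on Polish spaces. First, the discounted optimality equation (DOE) is established, together with the existence and uniqueness of its solution. Second, the existence of optimal deterministic stationary policies is proved under suitable conditions. In addition, to discretize the action space, a sequence of quantized policies is constructed, so that policies on finite action spaces approximate the optimal stationary policy of the discounted CTMDP on a general (Polish) space. Finally, an example illustrates the asymptotic approximation result.
Keywords: continuous-time Markov decision processes; state-dependent discount factors; discounted criterion; quantized stationary policies; asymptotic optimality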
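3. Convergence of a sequence of first passage models for Markov decision processes with varying discount factors
Authors: Wu Xiao (吴晓), Guo Zhenbin (郭圳滨). Chinese Journal of Applied Probability and Statistics (《应用概率统计》; CSCD, Peking University Core), 2021, No. 6, pp. 598-610 (13 pages)
This paper studies the convergence of a sequence of first passage models for Markov decision processes with multiple constraints and varying discount factors on a countable state space. Using "occupation measures" and their properties, the optimization problem for the sequence of constrained first passage models is transformed into an equivalent constrained linear programming problem (the convex analytic approach). Under suitable conditions, the optimal values and optimal policies of the sequence of first passage models are shown to converge to those of the "limit" model.
Keywords: first passage models of Markov decision processes; multiple constraints; state-dependent discount factors; convex analytic approach; convergence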
4. An average-value-at-risk criterion for Markov decision processes with unbounded costs
Authors: Qiuli LIU, Wai-Ki CHING, Junyu ZHANG, Hongchu WANG. Frontiers of Mathematics in China (SCIE, CSCD), 2022, No. 4, pp. 673-687 (15 pages)
We study Markov decision processes under the average-value-at-risk criterion. The state space and the action space are Borel spaces, the costs are allowed to be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. Furthermore, we apply our main results to a cash-balance model.
Keywords: Markov decision processes; average-value-at-risk (AVaR); state-action dependent discount factors; optimal policy
5. Uniform estimate for maximum of randomly weighted sums with applications to insurance risk theory (cited by: 8)
Authors: WANG Dingcheng (1), SU Chun (2), ZENG Yong (1). Science China Mathematics (SCIE), 2005, No. 10, pp. 1379-1394 (16 pages)
Affiliations: (1) School of Management and School of Applied Mathematics, University of Electronic Science and Technology of China, Chengdu 610054, China; (2) Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
This paper obtains the uniform estimate for the maximum of sums of independent and heavy-tailed random variables with nonnegative random weights, which can be arbitrarily dependent on each other. Applications to ruin probabilities in a discrete time risk model with dependent stochastic returns are then considered.
Keywords: dependent stochastic return; discount factor; heavy-tails; discrete time insurance risk model; maxima of randomly weighted sums; ruin probability; tail probabilities; uniformly asymptotic estimate