1. CONSTRAINED DENUMERABLE STATE NON-STATIONARY MDPs WITH EXPECTED TOTAL REWARD CRITERION
Author: Guo Xianping (郭先平). Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2000, No. 2, pp. 205-212 (8 pages)
In this paper, we consider constrained denumerable state non-stationary Markov decision processes (MDPs, for short) with the expected total reward criterion. By introducing a Lagrange multiplier and using probabilistic and analytic methods, we prove the existence of constrained optimal policies. Moreover, we prove that a constrained optimal policy may be a Markov policy, or a randomized Markov policy that randomizes between two Markov policies that differ in only one state.
Keywords: non-stationary MDPs; expected total reward criterion; constrained optimal policies
2. Asymptotic Evaluations of the Stability Index for a Markov Control Process with the Expected Total Discounted Reward Criterion
Author: Jaime Eduardo Martínez-Sánchez. American Journal of Operations Research, 2021, No. 1, pp. 62-85 (24 pages)
In this work, a numerical estimate of the stability index is made for a controlled consumption-investment process with the discounted reward optimization criterion. Using explicit formulas for the optimal stationary policies and for the value functions, the stability index is calculated explicitly, and its asymptotic behavior as the discount coefficient approaches 1 is investigated through statistical techniques and numerical experiments. The results obtained define the conditions under which an approximate optimal stationary policy can be used to control the original process.
Keywords: Control Consumption-Investment Process; Discrete-Time Markov Control Process; Expected Total Discounted Reward; Probabilistic Metrics; Stability Index Estimation