
STRONG N-DISCOUNT AND FINITE-HORIZON OPTIMALITY FOR CONTINUOUS-TIME MARKOV DECISION PROCESSES (cited by: 1)

Abstract: This paper studies the strong n-discount (n = -1, 0) and finite-horizon criteria for continuous-time Markov decision processes in Polish spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Under mild conditions, the authors prove the existence of strong n-discount (n = -1, 0) optimal stationary policies by developing two equivalence relations: one between the standard expected average reward and strong -1-discount optimality, and the other between the bias and strong 0-discount optimality. The authors also prove the existence of an optimal policy for a finite-horizon control problem by developing an interesting characterization of a canonical triplet.
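The two pairings named in the abstract (average reward with -1-discount, bias with 0-discount) can be illustrated on a toy case. The sketch below is not the paper's setting: it assumes a two-state chain under one fixed stationary policy, with bounded rates `a`, `b` and rewards `r0`, `r1`, all of which are hypothetical names chosen for illustration. For such a chain the discounted value is V_alpha = (alpha*I - Q)^{-1} r, and the standard expansion V_alpha = g/alpha + h + O(alpha) shows that alpha*V_alpha tends to the average reward g, while V_alpha - g/alpha tends to the bias h as alpha decreases to 0.

```python
# Toy illustration only: two states, one fixed policy, generator
# Q = [[-a, a], [b, -b]] with a, b > 0, and reward rates (r0, r1).
# All names here are hypothetical; the paper treats Polish spaces
# with unbounded rates, which this sketch does not cover.

def discounted_value(a, b, r0, r1, alpha):
    """Return V_alpha = (alpha*I - Q)^{-1} r via the closed-form 2x2 inverse."""
    det = alpha * (alpha + a + b)  # determinant of alpha*I - Q
    v0 = ((alpha + b) * r0 + a * r1) / det
    v1 = (b * r0 + (alpha + a) * r1) / det
    return v0, v1

def gain(a, b, r0, r1):
    """Long-run average reward: stationary distribution (b, a)/(a+b) times r."""
    return (b * r0 + a * r1) / (a + b)

def bias(a, b, r0, r1):
    """Bias h solving g = r + Q h with zero stationary mean."""
    s = (a + b) ** 2
    return a * (r0 - r1) / s, b * (r1 - r0) / s

if __name__ == "__main__":
    a, b, r0, r1 = 1.0, 2.0, 3.0, 0.0
    g = gain(a, b, r0, r1)
    h0, h1 = bias(a, b, r0, r1)
    for alpha in (1e-1, 1e-3, 1e-5):
        v0, v1 = discounted_value(a, b, r0, r1, alpha)
        # alpha * V_alpha approaches g; V_alpha - g/alpha approaches h.
        print(alpha, alpha * v0, v0 - g / alpha, v1 - g / alpha)
    print("gain:", g, "bias:", (h0, h1))
```

In this toy case, comparing policies by the leading term g/alpha corresponds to the average-reward comparison, and comparing them by the next term h corresponds to the bias comparison, which is the intuition behind the two equivalences stated in the abstract.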
Source: Journal of Systems Science & Complexity (系统科学与复杂性学报(英文版)), indexed in SCIE, EI, CSCD; 2014, No. 5, pp. 1045-1063 (19 pages).
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61374080 and 61374067; the Natural Science Foundation of Zhejiang Province under Grant No. LY12F03010; the Natural Science Foundation of Ningbo under Grant No. 2012A610032; and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
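Keywords: continuous-time Markov decision process; expected average reward criterion; finite-horizon optimality; Polish space; strong n-discount optimality; Markov decision process; continuous time; discount; optimal stationary policy; control problem; optimal policy; horizon; reward rate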

