CONTINUOUS TIME MARKOV DECISION PROGRAMMING WITH AVERAGE REWARD CRITERION AND UNBOUNDED REWARD RATE


Abstract: This paper deals with continuous time Markov decision programming (CTMDP for short) with unbounded reward rate. The economic criterion is the long-run average reward. For models with a countable state space and compact metric action sets, we present a set of sufficient conditions that ensure the existence of stationary optimal policies.

Author: Zheng Shaohui (郑少慧)

Affiliation: Shandong Mining Institute

Source: Acta Mathematicae Applicatae Sinica (应用数学学报, English edition), 1991, No. 1, pp. 6-16 (11 pages). Indexed in SCIE and CSCD.

Funding: This paper was prepared with the support of the National Youth Science Foundation.

Keywords: continuous time Markov decision programming; average reward criterion; unbounded reward rate; CTMDP

Classification: O1 [Science: Fundamental Mathematics]

