
Average Optimality in Markov Decision Processes with Unbounded Rewards (in English)
Abstract: This paper studies average optimality in Markov decision processes with a countable state space, nonempty action sets, and an unbounded reward function. It presents new conditions under which an (ε-)optimal stationary policy exists and under which the average criterion optimality inequality holds whenever the summation in it is well defined.
Author: Hu Qiying (胡奇英)
Source: Operations Research Transactions (《运筹学学报》), indexed in CSCD and the Peking University Core Journals list, 2002, No. 1, pp. 1-8 (8 pages)
Funding: The project was supported by the National Natural Science Foundation of China.
Keywords: Markov decision process; average criterion optimality inequality; unbounded rewards; nonempty action sets

References (11)

  • [1] A. Arapostathis, V.S. Borkar, E. Fernandez-Gaucherand, M.K. Ghosh, and S.I. Marcus, Discrete-time controlled Markov processes with average cost criterion: a survey, SIAM J. Control Optim., 31 (1993), 282-334.
  • [2] O. Hernandez-Lerma and J.B. Lasserre, Weak conditions for average optimality in Markov control processes, Sys. Control Lett., 22 (1994), 287-291.
  • [3] Q. Hu, Discounted and average Markov decision processes with unbounded rewards: new conditions, J. Math. Anal. Appl., 171 (1992), 111-124.
  • [4] Q. Hu and C. Xu, The finiteness of the reward function and the optimal value function in Markov decision processes, Math. Methods in Oper. Res., 49(2) (1999), 255-266.
  • [5] S.A. Lippman, Semi-Markov decision processes with unbounded rewards, Mgt. Sci., 19 (1973), 717-731.
  • [6] R.K. Ritt and L.I. Sennott, Optimal stationary policies in general state space Markov decision chains with finite action sets, Math. Oper. Res., 17 (1992), 901-909.
  • [7] M. Schäl, Average optimality in dynamic programming with general state space, Math. Oper. Res., 18 (1993), 163-172.
  • [8] L.I. Sennott, Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs, Oper. Res., 37 (1989), 626-633.
  • [9] L.I. Sennott, Average cost semi-Markov decision processes and the control of queueing systems, Prob. Eng. Inform. Sci., 3 (1989), 247-272.
  • [10] L.I. Sennott, Another set of conditions for average optimality in Markov control processes, Sys. Control Lett., 24 (1995), 147-151.
