Abstract
This paper studies successive approximation methods for discounted Markov decision programming with the unbounded rewards of [1], covering both the usual successive approximation method and successive approximation in finite-state approximations of denumerable-state problems. The convergence of both methods is established, and error bounds for the latter are derived.
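The usual successive approximation method referred to above is value iteration: repeatedly apply the discounted Bellman operator until the value estimates stop changing. A minimal sketch for a small finite MDP follows; the two-state, two-action model (states, actions, rewards, transition probabilities, and the discount factor) is entirely hypothetical and is not taken from [1], which treats the unbounded-reward and denumerable-state setting.

```python
# A minimal sketch of successive approximation (value iteration) for a
# discounted MDP. The MDP below is a hypothetical illustration only.

GAMMA = 0.9  # discount factor, strictly less than 1

# P[s][a] = list of (next_state, probability); R[s][a] = one-step reward
P = {
    0: {0: [(0, 0.8), (1, 0.2)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(0, 0.3), (1, 0.7)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: 0.5}}

def value_iteration(tol=1e-8):
    """Iterate v_{n+1}(s) = max_a [R(s,a) + gamma * sum_t p(t|s,a) v_n(t)]
    until the sup-norm change is below tol. Since the Bellman operator is
    a gamma-contraction, the iterates converge to the optimal value."""
    v = {s: 0.0 for s in P}
    while True:
        v_new = {
            s: max(R[s][a] + GAMMA * sum(p * v[t] for t, p in P[s][a])
                   for a in P[s])
            for s in P
        }
        if max(abs(v_new[s] - v[s]) for s in P) < tol:
            return v_new
        v = v_new
```

The finite-state approximation studied in the paper would, roughly speaking, run such an iteration on truncations of a denumerable state space and bound the error of the truncated values against the true ones.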
Funding
National Natural Science Foundation of China