This paper is concerned with the continuous-time Markov decision processes (MDP) having weak and strong interactions. Using a hierarchical approach, the state space of the underlying Markov chain can be decomposed int...This paper is concerned with the continuous-time Markov decision processes (MDP) having weak and strong interactions. Using a hierarchical approach, the state space of the underlying Markov chain can be decomposed into several groups of recurrent states and a group of transient states resulting in a singularly perturbed MDP formulation. Instead of solving the original problem directly, a limit problem that is much simpler to handle is derived. On the basis of the optical solution of the limit problem, nearly optimal decisions are constructed for the original problem. The asymptotic optimality of the constructed control is obtained; the rate of convergence is ascertained.展开更多
基金The research of this author is supported in part by the Office of Naval Research Grant N00014-96-1-0263.The research of this a
文摘This paper is concerned with the continuous-time Markov decision processes (MDP) having weak and strong interactions. Using a hierarchical approach, the state space of the underlying Markov chain can be decomposed into several groups of recurrent states and a group of transient states resulting in a singularly perturbed MDP formulation. Instead of solving the original problem directly, a limit problem that is much simpler to handle is derived. On the basis of the optical solution of the limit problem, nearly optimal decisions are constructed for the original problem. The asymptotic optimality of the constructed control is obtained; the rate of convergence is ascertained.