Optimization Algorithms for Markov Control Processes Using Neuro-dynamic Programming (cited by: 1)
Abstract: Motivated by the need for online optimization of real-world engineering systems, this paper studies single-sample-path-based optimization algorithms for Markov control processes governed by randomized stationary policies. The algorithms build on the concept of Markov performance potentials, and the policies may be represented by approximation architectures such as neural networks. Unlike traditional computation-based approaches, the policy parameters are updated iteratively along a sample path obtained by observing a real system in operation, so that an optimal (or suboptimal) randomized stationary policy can be found. This optimization method is a form of neuro-dynamic programming. With a suitable choice of algorithm parameters, the algorithms adapt to different real systems. Finally, the convergence of the algorithms with probability one on an infinite sample path is analyzed, and a numerical example for a three-state controlled Markov chain is provided.
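Source: Journal of University of Science and Technology of China (JUSTC), 2001, No. 5, pp. 549-557 (9 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.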
Funding: Supported by the National Natural Science Foundation of China (69974037) and the National High Performance Computing Foundation (00208).
Keywords: Markov performance potentials; Markov control processes; randomized stationary policies; sample path; neuro-dynamic programming; stochastic decision problems