
Parallel Optimization for Markov Control Processes Based on Performance Potentials Simulation
Abstract: A Markov control process is an important model for studying performance optimization of stochastic discrete event dynamic systems, and it is widely applied in practical engineering problems. Based on the theory of Markov performance potentials, we study simulation-based performance optimization for a class of continuous-time Markov control processes with compact action sets. Because the state space of a real system is often very large, conventional serial simulation algorithms may take too long or may be infeasible under hardware constraints. We therefore propose a parallel simulation optimization algorithm based on performance potentials to find an optimal stationary policy of the system. A simulation example shows that the algorithm achieves good efficiency, and it can be applied to performance optimization of large-scale practical systems.
Source: Journal of System Simulation (《系统仿真学报》), CAS, CSCD, 2003, No. 11, pp. 1574-1576 (3 pages)
Funding: National Natural Science Foundation of China (69974037); Anhui Provincial Natural Science Foundation (01042308)
Keywords: performance potential; parallel simulation algorithm; continuous-time Markov control process; compact action set
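As a hedged illustration of the approach the abstract describes: the performance potential g(i) of a continuous-time Markov chain can be estimated from a sample path started in state i, the estimates for different start states are independent and can therefore be farmed out to parallel workers, and the resulting potentials drive a policy-improvement step toward an optimal stationary policy. The sketch below is a minimal Python example on a hypothetical 3-state, 2-action process; the generator `RATES`, cost rates `COST`, horizons, and seeds are all invented for illustration, and a thread pool stands in for the true parallel hardware the paper targets. It is not the authors' algorithm, only a sketch of the potential-based idea.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 3-state, 2-action continuous-time Markov control process.
# RATES[a] is the infinitesimal generator under action a (rows sum to zero);
# COST[a][i] is the cost rate accrued while in state i under action a.
RATES = {
    0: [[-2.0, 1.0, 1.0], [1.0, -3.0, 2.0], [2.0, 1.0, -3.0]],
    1: [[-4.0, 2.0, 2.0], [2.0, -4.0, 2.0], [1.0, 2.0, -3.0]],
}
COST = {0: [1.0, 3.0, 2.0], 1: [2.0, 3.5, 1.5]}
N = 3  # number of states

def _step(rng, state, a):
    """Sample a holding time and the next state of the jump chain."""
    rate = -RATES[a][state][state]      # total exit rate of `state`
    hold = rng.expovariate(rate)
    r, acc, nxt = rng.random() * rate, 0.0, state
    for j in range(N):                  # pick next state proportionally
        if j == state:                  # to the off-diagonal rates
            continue
        acc += RATES[a][state][j]
        if r <= acc:
            nxt = j
            break
    return hold, nxt

def estimate_eta(policy, horizon=2000.0, seed=1):
    """Long-run average cost eta from one long sample path."""
    rng, total, t, state = random.Random(seed), 0.0, 0.0, 0
    while t < horizon:
        hold, nxt = _step(rng, state, policy[state])
        hold = min(hold, horizon - t)
        total += COST[policy[state]][state] * hold
        t, state = t + hold, nxt
    return total / horizon

def estimate_potential(i, policy, eta, horizon=200.0, seed=0):
    """Monte-Carlo estimate of g(i) ~ E[ int_0^T (f(X_t) - eta) dt | X_0 = i ]."""
    rng, g, t, state = random.Random(seed + i), 0.0, 0.0, i
    while t < horizon:
        hold, nxt = _step(rng, state, policy[state])
        hold = min(hold, horizon - t)
        g += (COST[policy[state]][state] - eta) * hold
        t, state = t + hold, nxt
    return g

def improve(policy, g):
    """One policy-improvement step using the estimated potentials."""
    return tuple(
        min(range(2),
            key=lambda a: COST[a][i] + sum(RATES[a][i][j] * g[j] for j in range(N)))
        for i in range(N)
    )

policy = (0, 0, 0)
eta = estimate_eta(policy)
# The N potential estimates are independent simulations, so they can run
# on N workers in parallel (processes or MPI ranks on real hardware).
with ThreadPoolExecutor(max_workers=N) as ex:
    g = list(ex.map(lambda i: estimate_potential(i, policy, eta), range(N)))
new_policy = improve(policy, g)
```

Iterating the estimate/improve pair until the policy stops changing yields a simulation-based policy-iteration loop; the parallel gain comes from the state-wise decomposition of the potential estimation, which is the dominant cost when the state space is large.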
