Parameter Decision in Adaptive Partially Observable Markov Decision Process with Finite Planning Horizon
Abstract: An adaptive decision algorithm is proposed for partially observable Markov decision processes (POMDPs) with uncertain parameters and a finite planning horizon. The basic idea is to use Bayesian theory to "learn" the unknown system: the parameter is estimated through a parameter decision that minimizes the probability of decision error, and the control decision is then made on the basis of that estimate, so that the optimal decision is achieved with maximum probability. The convergence of the decision algorithm is proved, and simulation results demonstrate its effectiveness.
Source: Journal of Shanghai Jiaotong University (《上海交通大学学报》; indexed in EI, CAS, CSCD, PKU Core), 2000, No. 12, pp. 1653-1657 (5 pages)
Funding: National Natural Science Foundation of China (Grant No. 69874025)
Keywords: partially observable Markov decision process; adaptive control; Bayesian principle; adaptive control systems; learning algorithms; Markov processes; optimization; parameter estimation
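
The learning step described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm; it assumes the unknown parameter takes one of finitely many candidate values, each defining its own transition and observation matrices. All names (`observation_likelihood`, `bayes_step`, `map_parameter`, the `models` layout) are hypothetical. Choosing the candidate with the largest posterior probability is the standard rule that minimizes the probability of picking the wrong parameter under 0-1 loss; the finite-horizon control step computed on the selected model is omitted.

```python
def observation_likelihood(belief, T, O, y):
    """Return P(y | belief, action) under one candidate model and the updated belief.

    T[s][s2] is the transition probability for the chosen action and
    O[s2][y] is the probability of observing y in the next state s2.
    """
    n = len(belief)
    # Predict the next-state distribution, then weight it by the observation.
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    joint = [predicted[s2] * O[s2][y] for s2 in range(n)]
    likelihood = sum(joint)
    if likelihood == 0.0:
        return 0.0, belief  # observation impossible under this candidate
    return likelihood, [j / likelihood for j in joint]


def bayes_step(posterior, beliefs, models, action, y):
    """One Bayes update of the parameter posterior and the per-candidate beliefs."""
    weights, new_beliefs = [], []
    for p, b, m in zip(posterior, beliefs, models):
        lik, nb = observation_likelihood(b, m["T"][action], m["O"][action], y)
        weights.append(p * lik)
        new_beliefs.append(nb)
    total = sum(weights)
    if total == 0.0:
        return posterior, beliefs  # no candidate explains y; keep old values
    return [w / total for w in weights], new_beliefs


def map_parameter(posterior):
    """Index of the most probable candidate (maximum a posteriori choice)."""
    return max(range(len(posterior)), key=lambda i: posterior[i])
```

In use, `bayes_step` would be called once per decision epoch with the action taken and the observation received, and `map_parameter` would select the model on which the control decision is computed.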