Survey of predictive state representations (预测状态表示综述)

Abstract: Predictive state representations (PSRs) are a new class of models for discrete-time stochastic dynamical systems with finite action and observation sets. A PSR represents the system's state as a vector of predictions about the observable outcomes of tests, that is, action-observation sequences that could be carried out on the system in the future. This representation can ease the computational complexity that arises in existing decision processes for stochastic systems. This paper introduces the basic principles of PSR models, describes PSR modeling procedures and planning algorithms, analyzes and compares the existing modeling and planning methods, points out development trends in the field, and finally outlines the challenges facing PSR research.

Authors: Lei Zhu (雷珠), Liu Feng (刘峰), Zhao Zhihong (赵志宏)

Affiliation: Software Institute, Nanjing University (南京大学软件学院)

Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 2, pp. 401-404 (4 pages)

Funding: National Natural Science Foundation of China (60775046)

Keywords: stochastic systems; predictive state representations (PSR); core-test discovery; model parameter learning; planning algorithms

CLC number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]



