

An Algorithm for Resetting PSR Models
Abstract  Predictive State Representations (PSRs) have been proposed as an alternative to partially observable Markov decision processes (POMDPs) for modeling dynamical systems. Although both POMDPs and PSRs provide general frameworks for solving partially observable problems, in real-world applications a PSR model learned from samples is almost certainly inaccurate. As the number of computation steps grows, the prediction vector computed with such a model may therefore drift farther and farther from its true value, lowering the prediction accuracy of the PSR model. This paper presents an algorithm for resetting learned PSR models: the PSR state the system currently occupies is identified by discriminant analysis, and the computed prediction vector is then reset accordingly, improving the accuracy of the PSR model. Empirical comparisons between PSR models with and without resetting show that resetting the prediction vector yields clearly better prediction accuracy, demonstrating the effectiveness of the proposed algorithm.
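The paper itself gives no code; the sketch below is an illustrative Python reconstruction of the two operations the abstract describes: the standard linear-PSR prediction-vector update, which is where modeling error accumulates, and a reset step that reassigns the drifting vector to a reference PSR state. All names, shapes, parameter values, and the nearest-centroid rule (used here as a simple stand-in for the paper's discriminant-analysis classifier) are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def update_prediction_vector(p, M_ao, m_ao):
    """One linear-PSR update of the prediction vector p(Q|h) after
    executing action a and receiving observation o.

    p    : current prediction vector p(Q|h), shape (k,)
    M_ao : matrix whose i-th column is the parameter vector of the
           one-step-extended core test a o q_i, shape (k, k)
    m_ao : parameter vector of the one-step test a o, so p(ao|h) = m_ao @ p
    """
    denom = float(m_ao @ p)              # p(ao | h)
    if denom <= 1e-12:                   # guard against an inaccurate learned model
        denom = 1e-12
    return (M_ao.T @ p) / denom          # p(Q | h a o)

def reset_prediction_vector(p, reference_states):
    """Reset step: snap the (possibly drifted) prediction vector to the
    closest reference PSR state.  Nearest-centroid classification is used
    here as a stand-in for the paper's discriminant analysis."""
    refs = np.asarray(reference_states)                    # (n_states, k)
    idx = int(np.argmin(np.linalg.norm(refs - p, axis=1)))
    return refs[idx].copy()

# Hypothetical usage with toy parameters (not learned from data):
k = 3
p = np.full(k, 1.0 / k)                              # initial prediction vector
M_ao = 0.5 * np.eye(k)                               # toy update matrix for one (a, o)
m_ao = np.full(k, 0.5)                               # toy one-step test parameters
reference_states = [np.eye(k)[i] for i in range(k)]  # toy reference PSR states

for step in range(1, 11):
    p = update_prediction_vector(p, M_ao, m_ao)
    if step % 5 == 0:                                # reset every few steps
        p = reset_prediction_vector(p, reference_states)
```

The intent, per the abstract, is that the update step compounds model error over time, while a periodic reset replaces the drifted vector with the prediction vector of the identified PSR state, keeping the accumulated error bounded.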
Source  Chinese Journal of Computers (《计算机学报》; EI, CSCD, Peking University Core), 2012, No. 5, pp. 1046-1051 (6 pages).
Funding  Natural Science Foundation of Fujian Province (2010J05140); Specialized Research Fund for the Doctoral Program of Higher Education (20100121120022); National Natural Science Foundation of China (60774033).
Keywords  Predictive State Representation (PSR) model; prediction accuracy; reset; discriminant analysis; veracity of the Predictive State Representation model
Related Literature

References (13)

  • 1 Singh S, James M, Rudary M. Predictive state representations: A new theory for modeling dynamical systems//Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. Banff, Canada, 2004: 512-518.
  • 2 James M. Using predictions for planning and modeling in stochastic environments [Ph.D. dissertation]. University of Michigan, Ann Arbor, USA, 2005.
  • 3 McCallum R A. Hidden state and reinforcement learning with instance-based state identification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 1996, 26(3): 464-473.
  • 4 Kaelbling L P, Littman M L, Cassandra A R. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 1998, 101: 99-134.
  • 5 Littman M, Sutton R, Singh S. Predictive representations of state//Proceedings of the 2001 Neural Information Processing Systems (NIPS) Conference. Vancouver, Canada, 2002: 1555-1561.
  • 6 James M, Singh S. Learning and discovery of predictive state representations in dynamical systems with reset//Proceedings of the 21st International Conference on Machine Learning. Banff, Canada, 2004: 417-424.
  • 7 Wolfe B, James M, Singh S. Learning predictive state representations in dynamical systems without reset//Proceedings of the 22nd International Conference on Machine Learning. Bonn, Germany, 2005: 980-987.
  • 8 Liu Yunlong, Li Renhou. A new algorithm for discovery and learning of predictive state representations of non-resettable dynamical systems. Acta Electronica Sinica (电子学报), 2009, 37(1): 126-131.
  • 9Dinculescu M, Precup D. Approximate predictive representations of partially observable systems//Proceedings of the 27th International Conference on Machine Learning (ICML'10). Haifa, Israel, 2010:985-1002.
  • 10 Liu Yunlong, Ji Guoli, Yang Zijiang. Using learned PSR models for planning under uncertainty//Proceedings of the 23rd Canadian Conference on Artificial Intelligence. University of Ottawa, Ontario, Canada, 2010: 309-314.

