A Belief Selection Method in POMDP Point-Based Value Iteration Algorithm

Cited by: 3
Abstract: The partially observable Markov decision process (POMDP) is a mathematical model for decision making under uncertainty. Point-based value iteration algorithms are a class of approximate methods for solving POMDPs. To address belief selection, a key issue in point-based algorithms, this paper proposes an entropy-based belief selection method (EBBS). EBBS computes the uncertainty (entropy) of the belief points reachable from the current belief set, sorts them by entropy, and expands the belief set with points that have lower entropy and whose distance to the current belief set exceeds a threshold. Experimental results show that value iteration with entropy-based belief selection obtains the expected discounted reward while performing value iteration on a smaller number of belief points.
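The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the distance metric, the entropy base, or how many points are added per step, so the sketch assumes L1 distance, natural-log Shannon entropy, and at most one new successor per existing belief. The function names and the array layouts `T[a][s][s2]` and `O[a][s2][o]` are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy (natural log) of a belief, a distribution over states."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def l1_distance_to_set(belief, belief_set):
    """Smallest L1 distance from `belief` to any point in `belief_set` (assumed metric)."""
    return min(sum(abs(p - q) for p, q in zip(belief, b)) for b in belief_set)


def belief_update(belief, action, observation, T, O):
    """Bayes update of a belief after taking `action` and seeing `observation`.

    Assumed layout: T[a][s][s2] = P(s2 | s, a), O[a][s2][o] = P(o | s2, a).
    Returns None when the observation has zero probability.
    """
    n = len(belief)
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        return None
    return [p / total for p in unnormalized]


def expand_belief_set(belief_set, T, O, threshold):
    """One expansion pass in the spirit of entropy-based belief selection.

    For each current belief, enumerate its one-step successors, sort them by
    entropy (lowest first), and add the first one whose distance to the
    current set exceeds `threshold`.
    """
    n_actions = len(T)
    n_observations = len(O[0][0])
    new_points = []
    for b in belief_set:
        successors = []
        for a in range(n_actions):
            for o in range(n_observations):
                nb = belief_update(b, a, o, T, O)
                if nb is not None:
                    successors.append(nb)
        successors.sort(key=entropy)  # least uncertain candidates first
        for candidate in successors:
            if l1_distance_to_set(candidate, belief_set + new_points) > threshold:
                new_points.append(candidate)
                break
    return belief_set + new_points
```

In a point-based solver, a pass like this would alternate with value backups on the enlarged set; the threshold trades coverage of the belief space against the number of points that must be backed up.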
Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 5, pp. 77-80 (4 pages)
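Funding: National Natural Science Foundation of China (90709006); National "973" Program (2006CB504601); Beijing Municipal Science and Technology Commission Major Program (H020920010130); National Key Technology R&D Program (2007BA110B06-01)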
Keywords: POMDP; value iteration; point-based algorithm; belief selection; uncertainty
References (9)

  • 1. Sondik E J. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs[J]. Operations Research, 1978, 26(2): 282-304.
  • 2. Kaelbling L P, Littman M L, Cassandra A R. Planning and Acting in Partially Observable Stochastic Domains[J]. Artificial Intelligence, 1998, 101: 99-134.
  • 3. Liu Ke. Practical Markov Decision Processes[M]. Beijing: Tsinghua University Press, 2004.
  • 4. Zhang N L, Zhang W. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes[J]. Journal of Artificial Intelligence Research, 2001(14): 29-51.
  • 5. Pineau J, Gordon G, Thrun S. Point-Based Value Iteration: An Anytime Algorithm for POMDPs[C]//Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), Acapulco, Mexico, 2003: 1025-1030.
  • 6. Izadi M T, Precup D, Azar D. Belief Selection in Point-Based Planning Algorithms for POMDPs[C]//Proceedings of Canadian Conference on Artificial Intelligence (AI), Quebec City, Canada, 2006: 383-394.
  • 7. Izadi M T, Precup D. Exploration in POMDP Belief Space and Its Impact on Value Iteration Approximation[C]//European Conference on Artificial Intelligence (ECAI), Riva del Garda, Italy, 2006.
  • 8. Shani G, Brafman R I, Shimony S E. Forward Search Value Iteration for POMDPs[C]//Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), 2007: 2619-2624.
  • 9. Pineau J, Gordon G, Thrun S. Point-Based Approximations for Fast POMDP Solving[R]. Technical Report SOCS-TR-2005.4, School of Computer Science, McGill University, 2005: 1-45.
