Research on Point-based Value Iteration Method for FO-POMDP (original title: 基于点的FO-POMDP值迭代方法研究; cited by: 1)
Abstract: Building on the partially observable Markov decision process (POMDP), this paper defines the first-order partially observable Markov decision process (FO-POMDP), which expresses a POMDP in the situation calculus of first-order logic. The level of state abstraction in the FO-POMDP model is characterized, and the concepts of state granularity and belief-state granularity are introduced. A granularity resolution method reduces belief states to a single fixed granularity. Using a distance measure between belief points at that fixed granularity, point-based value iteration (PBVI) is extended to the level of logical abstraction, yielding first-order PBVI (FO-PBVI). Experimental results show that the algorithm solves problems quickly and produces good-quality solutions.
Source: Computer Engineering (《计算机工程》), CAS, CSCD, 2013, No. 10, pp. 217-220 (4 pages)
Funding: National Natural Science Foundation of China (71071160)
Keywords: partially observable Markov decision process (POMDP); state space; belief state; granularity resolution; point-based value iteration (PBVI)
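
The abstract relies on a distance between belief points once they share a fixed granularity. As a rough illustration only, the sketch below shows how such a distance is used in the flat (propositional) PBVI of Pineau et al.: each belief point is pushed one step forward by a Bayes update, and the successor farthest from the current point set is added. The L1 metric, the table layouts `T[a][s][s2]` and `O[a][s2][o]`, and all function names are assumptions made for this example. The paper's FO-PBVI works on first-order belief states, and its exact metric is not stated in this record. Enumerating every observation, rather than sampling one per action, is also a simplification.

```python
from typing import List, Optional

Belief = List[float]


def l1_distance(b1: Belief, b2: Belief) -> float:
    """L1 distance between two beliefs over the same state set."""
    return sum(abs(p - q) for p, q in zip(b1, b2))


def belief_update(b: Belief, a: int, o: int, T, O) -> Optional[Belief]:
    """Bayes filter: b'(s2) is proportional to O[a][s2][o] * sum_s T[a][s][s2] * b[s]."""
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    if z == 0.0:  # observation o is impossible after action a from belief b
        return None
    return [u / z for u in unnorm]


def expand(B: List[Belief], T, O) -> List[Belief]:
    """For each point in B, add the one-step successor farthest from B."""
    n_actions, n_obs = len(T), len(O[0][0])
    new_points: List[Belief] = []
    for b in B:
        best, best_d = None, 0.0
        for a in range(n_actions):
            for o in range(n_obs):
                nb = belief_update(b, a, o, T, O)
                if nb is None:
                    continue
                # Distance from the candidate to its nearest neighbour in B.
                d = min(l1_distance(nb, x) for x in B)
                if d > best_d:
                    best, best_d = nb, d
        if best is not None:
            new_points.append(best)
    return B + new_points


# Hypothetical two-state model with one "listen" action that never changes the state.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]
points = expand([[0.5, 0.5]], T, O)
```

In a full solver, this expansion step would alternate with point-based value backups over the current belief set. The backups are omitted here to keep the sketch short.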

References (10)

  • 1. Cassandra A R. A Survey of POMDP Applications[C]//Proc. of Symposium on Planning with Partially Observable Markov Decision Processes. [S. l.]: AAAI Press, 1998.
  • 2. Zhang Shiqi, Sridharan M. Vision-based Scene Analysis on Mobile Robots Using Layered POMDPs[C]//Proc. of International Conference on Automated Planning and Scheduling. Toronto, Canada: [s. n.], 2010.
  • 3. Ong S C W, Png S W, Hsu D, et al. Planning Under Uncertainty for Robotic Tasks with Mixed Observability[J]. International Journal of Robotics Research, 2010, 29(8): 1053-1068.
  • 4. Wang Chenggang, Schmolze J. Planning with POMDPs Using a Compact, Logic-based Representation[C]//Proc. of the 17th International Conference on Tools with Artificial Intelligence. [S. l.]: IEEE Computer Society, 2005: 523-530.
  • 5. Wang Chenggang, Brodley C, Mahadevan S, et al. First Order Markov Decision Processes[D]. Medford, USA: Tufts University, 2007.
  • 6. Wang Chenggang, Khardon R. Relational Partially Observable MDPs[C]//Proc. of the 24th AAAI Conference on Artificial Intelligence. Atlanta, Georgia: AAAI Press, 2010: 1153-1157.
  • 7. Sanner S, Kersting K. Symbolic Dynamic Programming for First-order POMDPs[C]//Proc. of the 24th AAAI Conference on Artificial Intelligence. Atlanta, Georgia: AAAI Press, 2010.
  • 8. Bian Aihua, Wang Chongjun, Chen Shifu. A Preprocessing Method for Point-based POMDP Algorithms (基于点的POMDP算法的预处理方法)[J]. Journal of Software (软件学报), 2008, 19(6): 1309-1316. (cited by: 6)
  • 9. Feng Qi, Zhou Xuezhong, Huang Houkuan, Zhang Xiaoping. A Belief Selection Method in Point-based Value Iteration Algorithms for POMDPs (POMDP基于点的值迭代算法中一种信念选择方法)[J]. Journal of Beijing Jiaotong University (北京交通大学学报), 2009, 33(5): 77-80. (cited by: 3)
  • 10. Pineau J, Gordon G, Thrun S. Point-based Value Iteration: An Anytime Algorithm for POMDPs[C]//Proc. of the 18th International Joint Conference on Artificial Intelligence. San Francisco, USA: Morgan Kaufmann Publishers Inc., 2003.

