Abstract
Building on the partially observable Markov decision process (POMDP), this paper presents the first-order partially observable Markov decision process (FO-POMDP), which expresses a POMDP in the situation calculus of first-order logic. The level of abstraction of states is an important issue in solving the FO-POMDP; to characterize it, the concepts of state granularity and belief-state granularity are proposed. A granularity resolution method reduces the granularity of belief states to a single fixed granularity. Using a distance measure between belief points at that fixed granularity, point-based value iteration (PBVI) is extended to the logical abstraction level, yielding first-order PBVI (FO-PBVI). Experimental results show that the algorithm solves problems quickly and produces good-quality solutions.
Source
《计算机工程》
CAS
CSCD
2013, No. 10, pp. 217-220 (4 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (71071160)
Keywords
Partially Observable Markov Decision Processes (POMDP)
state space
belief state
granularity resolution
Point-based Value Iteration (PBVI)