
Learning Derived Predicate Rules for Planning Domains under Partial Observability (cited by: 2)
Abstract: This paper presents a method for learning derived predicate rules for planning domains under partial observability. In PDDL (Planning Domain Description Language), derived predicates are a compact way to describe the indirect effects of actions, and they are an important part of planning domain models and search control knowledge. For most planning domains, however, writing derived predicate rules from scratch is difficult, even for experts, so it is worthwhile to study how to acquire these rules automatically from observed plans. Earlier work obtained derivation rules by refining an initial, incomplete domain theory. Its main drawback was that very few training examples were available for the predicates to be learned, because the examples were generated in a very limited way. The underlying reason was the assumption that the environment is unobservable. In the real world, many indirect effects of actions can be observed, either by simple visual inspection or with dedicated tools. This paper therefore adds observations that reflect the indirect effects of actions, which increases the number of training examples for the predicates to be learned and improves learning accuracy. In addition, to supply predicates that cannot be learned during inductive learning, the paper gives a post-processing method that makes the learned rules semantically more complete. Experiments on benchmark domains with derived predicates show that the proposed method is feasible and effective. More broadly, the work benefits research on the automatic modeling of planning domains and the automatic acquisition of control knowledge.
Source: Chinese Journal of Computers (《计算机学报》), 2015, No. 7, pp. 1372-1385 (14 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
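Funding: Supported by the Fundamental Research Funds for the Central Universities (21615438), the National Natural Science Foundation of China (61100134, 61003179, 61272073), and the Natural Science Foundation of Guangdong Province (S2013020012865, S2011040001427).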
Keywords: artificial intelligence; automated planning; derived predicates; rule learning; partial observability


