Learning Derived Predicate Rules for Planning Domains under Partial Observability

Cited by: 2

Abstract: This paper presents a method for learning derived predicate rules for planning domains under partial observability. In the Planning Domain Definition Language (PDDL), derived predicates are a compact way to describe the indirect effects of actions, and they are an important part of planning domain models and search control knowledge. However, for most planning domains, writing derived predicate rules from scratch is difficult, even for experts. It is therefore worthwhile to study how to acquire rules for derived predicates automatically from observed plans. Earlier work obtained such rules by refining an initial, incomplete domain theory, but its main drawback was that very few training examples were available for the predicates to be learned, because the examples were generated in a very limited way. The underlying reason is that this work assumed the environment to be unobservable. In the real world, however, many indirect effects of actions can be observed, either by simple visual inspection or with specialized tools. This paper therefore adds observations that reflect the indirect effects of actions, which increases the number of training examples for the predicates to be learned and improves learning accuracy. In addition, to recover predicates that the inductive learning process cannot learn, the paper proposes a post-processing algorithm that makes the learned rules semantically more complete. Experiments on benchmark domains with derived predicates show that the proposed method is feasible and effective. More broadly, the work supports research on automatic modeling of planning domains and automatic acquisition of control knowledge.

Authors: Rao Dongning (饶东宁), Jiang Zhihua (蒋志华), Jiang Yunfei (姜云飞), Deng Yuhui (邓玉辉)

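Affiliations: School of Computer Science, Guangdong University of Technology; Department of Computer Science, College of Information Science and Technology, Jinan University; Institute of Software, School of Information Science and Technology, Sun Yat-sen University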

Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, Issue 7, pp. 1372-1385 (14 pages)

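Funding: Supported by the Fundamental Research Funds for the Central Universities (21615438), the National Natural Science Foundation of China (61100134, 61003179, 61272073), and the Natural Science Foundation of Guangdong Province (S2013020012865, S2011040001427)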

Keywords: artificial intelligence; automated planning; derived predicates; rule learning; partial observability

Classification: TP182 [Automation and Computer Technology: Control Theory and Control Engineering]


