Structure-Oriented Learning-Based Planning Method

Abstract: Learning in planning has attracted renewed attention in recent years. How to use a learning mechanism to improve an existing planner so that it reliably and convincingly outperforms non-learning planners remains an open problem. This paper presents a structure-oriented learning-based planning method (SOLP) that exploits the structure of planning problems and of their solutions. The method represents prior knowledge as "sub-problem, plan fragment" pairs. Each time the planner finds a solution, SOLP uses the problem's initial state and goal state to construct an initial sub-state and a goal sub-state for each planning object, which together form a sub-problem, and extracts from the solution the plan fragment that corresponds to that sub-problem. Each piece of prior knowledge is recorded once and saved in a prior knowledge base kept in the domain file. To solve a new problem, SOLP first retrieves from the knowledge base the prior knowledge whose problem structure is equivalent or similar to that of the current problem. It then instantiates and merges this knowledge into ground knowledge and encodes it as satisfiability clauses. Finally, these clauses, together with the clauses that encode the problem itself, are passed to a SAT solver, which determines the final solution. Experiments on benchmark problems from several IPC domains show that SOLP solves problems markedly faster than traditional non-learning planners, with an efficiency improvement of up to about 80% in the best case.
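The learn-then-recall cycle described in the abstract can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the authors' implementation: facts and actions are plain tuples, a sub-problem is matched by exact equality after the object is replaced by a variable, and the function names (`lift`, `sub_problem`, `learn`, `recall`) are hypothetical. The merging step and the SAT encoding are omitted.

```python
# Toy sketch (assumed representation): facts and actions are tuples
# of the form (name, arg1, arg2, ...). Not the SOLP implementation.

def lift(items, obj):
    """Replace the concrete object `obj` by the variable '?x' in each tuple."""
    return tuple(tuple('?x' if arg == obj else arg for arg in item)
                 for item in items)

def sub_problem(obj, init, goal):
    """Lifted initial and goal sub-states: the facts that mention `obj`."""
    init_sub = frozenset(lift([f for f in init if obj in f[1:]], obj))
    goal_sub = frozenset(lift([f for f in goal if obj in f[1:]], obj))
    return init_sub, goal_sub

def learn(kb, objects, init, goal, plan):
    """After a successful run, store one plan fragment per sub-problem."""
    for obj in objects:
        key = sub_problem(obj, init, goal)
        if not key[1]:
            continue  # the object has no goal, so there is nothing to learn
        fragment = lift([a for a in plan if obj in a[1:]], obj)
        kb.setdefault(key, fragment)  # each sub-problem is recorded only once

def recall(kb, obj, init, goal):
    """Retrieve the fragment for an equivalent sub-problem and ground it."""
    fragment = kb.get(sub_problem(obj, init, goal))
    if fragment is None:
        return None
    return [tuple(obj if arg == '?x' else arg for arg in act)
            for act in fragment]

# A solved logistics-style problem (made-up data for illustration).
kb = {}
init = [('at', 'pkg1', 'A'), ('at', 'truck', 'A')]
goal = [('at', 'pkg1', 'B')]
plan = [('load', 'pkg1', 'truck', 'A'),
        ('drive', 'truck', 'A', 'B'),
        ('unload', 'pkg1', 'truck', 'B')]
learn(kb, ['pkg1'], init, goal, plan)

# A new problem with a different package but the same sub-problem structure.
hint = recall(kb, 'pkg2',
              [('at', 'pkg2', 'A'), ('at', 'truck', 'A')],
              [('at', 'pkg2', 'B')])
print(hint)
```

In this sketch the recalled fragment contains only the actions that mention the package, so it is a hint rather than a complete plan. In the method described above, such ground knowledge would be encoded as clauses and handed to the SAT solver together with the problem encoding.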

Authors: Chen Aixiang (陈蔼祥), Jiang Yunfei (姜云飞), Chai Xiaolong (柴啸龙), Bian Rui (边芮), Chen Qingliang (陈清亮)

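Affiliations: School of Mathematics and Statistics, Guangdong University of Finance and Economics; Software Research Institute, Sun Yat-sen University; School of Public Administration, Guangdong University of Finance and Economics; Department of Computer Science, Jinan University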

Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, Issue 8, pp. 1743-1760 (18 pages)

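Funding: National Basic Research Program of China (973 Program) (2005CB321902, 2010CB328103); National Natural Science Foundation of China (60773201, 61003056); Natural Science Foundation of Guangdong Province (10451032001006140); Applied Basic Research Program of the Guangzhou Science and Technology and Informatization Bureau (2010Y1-C641); Guangdong Provincial Department of Education Program for Outstanding Young Innovative Talents in Universities (LYM10081, LYM_0065); Fundamental Research Funds for the Central Universities (21612414); Science and Technology Innovation Project of the Guangdong Provincial Department of Education (2013kjcx0086); Natural Science Research Project of Guangdong University of Finance and Economics (11BS52001)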

Keywords: problem structure; solution structure; plan fragment; structure knowledge; learning

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
