
Research on an Error-Driven Method for Automatically Extending Semantic Grammars

Automatic Error-driven Learning Method of Semantic Grammar
Abstract Domain-specific natural language understanding is one of the core technologies behind applications such as vertical search engines and domain-specific question answering systems. Building on an existing natural language understanding system based on ontology and semantic grammar, this paper proposes an error-driven method for automatically extending a semantic grammar. For a sentence that fails to parse, the method uses the core grammar to generate partial parse trees, selects the best set of partial parse trees according to a scoring function, and uses a prediction model to predict their upper-level nodes in an attempt to build a complete parse tree, from which new grammar rules are learned. The learned rules, which fall into different types, are verified and then used to update the core grammar. A learnability metric is applied to sentences to filter the learning candidates, which improves the overall quality and efficiency of grammar extension. The method was tested on two domain datasets of different scales. Under the interactive learning paradigm, the learning efficiency of the algorithm was compared across the two domains. Under the batch learning paradigm, the extended grammar was compared with the core grammar on both datasets in terms of accuracy, MRR, and recognition rate. The experimental results indicate that the proposed method is effective.
Authors WANG Dong-sheng (王东升); WANG Wei-min (王卫民); QI Yun-song (祁云松); WANG Shi (王石); CAO Cun-gen (曹存根) (School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
Source Acta Electronica Sinica (《电子学报》), 2021, No. 2, pp. 248-259 (12 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (No. 61702234); Key R&D Program of the Ministry of Science and Technology (No. 2017YFC1700302).
Keywords semantic grammar; grammar extension; natural language understanding; domain-specific; ontology
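
The abstract describes the learning step only in outline: partially parse a failed sentence with the core grammar, pick the best set of partial parse trees with a scoring function, predict the upper-level node, and hypothesize a new rule. The toy sketch below illustrates that flow under stated assumptions and is not the authors' implementation. The grammar `CORE_RULES`, the scoring heuristic in `score` (prefer more covered tokens, then fewer fragments), and the caller-supplied `predict_root` are all hypothetical stand-ins, since the abstract does not specify the actual scoring function or prediction model. Rule verification and learnability filtering are omitted.

```python
from dataclasses import dataclass

# Hypothetical toy core grammar: right-hand side (token tuple) -> category.
CORE_RULES = {
    ("price", "of"): "ATTR_PRICE",
    ("the", "phone"): "ENTITY",
    ("phone",): "ENTITY",
}


@dataclass(frozen=True)
class Fragment:
    label: str     # category of a parsed span, or the raw token if uncovered
    start: int     # inclusive token index
    end: int       # exclusive token index
    covered: bool  # True if the span was matched by a core rule


def partial_parses(tokens, rules, start=0):
    """Enumerate segmentations of tokens[start:] into rule-matched spans
    and single uncovered tokens."""
    if start == len(tokens):
        yield []
        return
    for rhs, lhs in rules.items():
        end = start + len(rhs)
        if tuple(tokens[start:end]) == rhs:
            for rest in partial_parses(tokens, rules, end):
                yield [Fragment(lhs, start, end, True)] + rest
    # Fallback: leave the current token uncovered.
    for rest in partial_parses(tokens, rules, start + 1):
        yield [Fragment(tokens[start], start, start + 1, False)] + rest


def score(fragments):
    """Assumed heuristic: more covered tokens first, then fewer fragments."""
    covered = sum(f.end - f.start for f in fragments if f.covered)
    return (covered, -len(fragments))


def learn_rule(tokens, rules, predict_root):
    """Return a candidate rule (lhs, rhs) for a sentence that failed to parse."""
    best = max(partial_parses(tokens, rules), key=score)
    rhs = tuple(f.label for f in best)
    return predict_root(best), rhs
```

For example, calling `learn_rule("what is the price of the phone".split(), CORE_RULES, lambda frags: "QUERY_ATTR")` proposes a rule whose right-hand side mixes the uncovered words with the categories recovered by the core grammar. In a fuller system the candidate would then be verified before being added to the core grammar, as the abstract describes.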
