Abstract
Most current semantic parsing methods are based on compositional semantics, and the lexicon is at their core. A lexicon is a set of lexical entries, each of which maps a word or phrase in a natural language sentence to a predicate in the knowledge base ontology. Semantic parsing has long suffered from insufficient lexicon coverage. To address this problem, this paper builds on existing work and proposes a bridging-based lexicon learning method that automatically introduces new lexical entries during training and learns them. To further improve the accuracy of the newly learned entries, the paper designs a new word-to-binary-predicate feature template and uses a voting-based method to obtain a core lexicon. Comparative experiments on two public datasets, WebQuestions and Free917, show that the method learns new lexical entries and increases lexicon coverage, which in turn improves the performance of the semantic parsing system, particularly its recall.
Authors
陈波
孙乐
韩先培
CHEN Bo; SUN Le; HAN Xianpei (Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2019, No. 5, pp. 24-30 (7 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (61433015, 61572477)
Keywords
semantic parsing
lexicon learning
compositional semantics
coverage