Abstract
Entity relation extraction is an important part of information extraction. This paper describes a multi-information fusion approach to entity semantic relation extraction that makes full use of the various features and information available in Chinese text to improve extraction performance. The approach combines a feature-vector-based method with a tree-kernel-based method: the feature vectors capture the linguistic information of the text, while the tree kernel captures its structural information. Experiments on relation detection and on the classification of six major relation types were carried out on the Automatic Content Extraction (ACE) 2005 benchmark corpus. The results show that the method identifies most non-relation instances and also improves, to some extent, the precision and recall for the individual relation types.
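The abstract does not say how the feature vectors and the tree kernel are combined. As a minimal sketch of one common way to fuse the two, the code below assumes a convex combination of a normalized linear kernel over sparse features and a normalized subset-tree kernel in the style of Collins and Duffy. The names `Node`, `Instance`, `composite_kernel`, the weight `alpha`, the decay factor, and the toy features are all hypothetical and are not taken from the paper.

```python
import math


class Node:
    """A parse-tree node; leaves (words) have no children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def production(self):
        # A production is the node label plus the ordered child labels.
        return (self.label, tuple(c.label for c in self.children))


def all_nodes(tree):
    out = [tree]
    for child in tree.children:
        out.extend(all_nodes(child))
    return out


def tree_kernel(t1, t2, decay=0.5):
    """Subset-tree kernel: decay-weighted count of shared tree fragments."""
    memo = {}

    def delta(n1, n2):
        # Leaves and nodes with different productions share no fragment.
        if not n1.children or not n2.children:
            return 0.0
        if n1.production() != n2.production():
            return 0.0
        key = (id(n1), id(n2))
        if key not in memo:
            score = decay
            for c1, c2 in zip(n1.children, n2.children):
                score *= 1.0 + delta(c1, c2)
            memo[key] = score
        return memo[key]

    return sum(delta(a, b) for a in all_nodes(t1) for b in all_nodes(t2))


def feature_kernel(f1, f2):
    """Linear kernel over sparse feature dictionaries."""
    return sum(v * f2.get(k, 0.0) for k, v in f1.items())


def normalized(kernel, x, y):
    """Scale a kernel so that K(x, x) == 1 for any non-degenerate x."""
    denom = math.sqrt(kernel(x, x) * kernel(y, y))
    return kernel(x, y) / denom if denom > 0 else 0.0


class Instance:
    """One candidate entity pair: flat features plus a parse tree."""

    def __init__(self, features, tree):
        self.features = features
        self.tree = tree


def composite_kernel(a, b, alpha=0.5):
    # alpha weights the flat features; (1 - alpha) weights the structure.
    k_feat = normalized(feature_kernel, a.features, b.features)
    k_tree = normalized(tree_kernel, a.tree, b.tree)
    return alpha * k_feat + (1.0 - alpha) * k_tree


# Toy instances with made-up features and two-word noun phrases.
inst_a = Instance(
    {"type1=ORG": 1.0, "type2=GPE": 1.0, "head=大学": 1.0},
    Node("NP", [Node("NR", [Node("厦门")]), Node("NN", [Node("大学")])]),
)
inst_b = Instance(
    {"type1=ORG": 1.0, "type2=GPE": 1.0, "head=公司": 1.0},
    Node("NP", [Node("NR", [Node("北京")]), Node("NN", [Node("公司")])]),
)

similarity_ab = composite_kernel(inst_a, inst_b)
similarity_aa = composite_kernel(inst_a, inst_a)
```

A kernel of this form could be supplied to any kernel-based classifier as a precomputed Gram matrix. The weight `alpha` would normally be tuned on held-out data, since nothing in the abstract fixes the relative importance of the two information sources.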
Source
Journal of Xiamen University (Natural Science)
CAS
CSCD
PKU Core
2011, No. 3, pp. 540-545 (6 pages)
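Funding
National Natural Science Foundation of China (60803078)
Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education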
Keywords
relation extraction
feature
tree kernel