N-ary Chinese Open Entity-relation Extraction (中文开放式多元实体关系抽取) Cited by: 13

Abstract Traditional information extraction (IE) targets specific domains, satisfying precise, narrow, pre-specified requests over small homogeneous corpora. Moving to a new domain requires the user to name the target relations and to manually write new extraction rules or hand-label new training examples. Open information extraction (OIE) overcomes the limitations of traditional IE techniques, which train an individual extractor for every relation type. Most existing OIE systems target English. Work on Chinese is comparatively scarce and mainly extracts triples, and no method exists for extracting n-ary tuples from Chinese text. This paper therefore proposes N-COIE, a dependency-parsing-based method for n-ary Chinese open entity-relation extraction. First, the text collection is preprocessed with natural language processing tools and dependency-parsed. Next, each verb is treated as a candidate relation word, and each base noun phrase that is linked to that verb by a valid dependency path meeting the method's conditions is treated as an entity word; a relation word that links two or more entity words forms a candidate n-ary entity-relation group together with them. Finally, a trained logistic regression classifier filters the candidate groups. On a Baidu Baike dataset, the proposed method reaches an accuracy of up to 81% when extracting a large number of n-ary entity-relation tuples.
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal (北大核心), 2017, Issue S1, pp. 80-83 (4 pages)
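Funding Supported by the project "Research on Chinese Discourse Anaphora Resolution Strategies Based on Frame Semantic Annotation" (2012011011-2)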
Keywords Chinese open information extraction; Dependency parsing; Entity-relation extraction; Machine learning; OIE; Word2vec
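
The candidate-generation step described in the abstract (verbs as candidate relation words, noun phrases reached by a valid dependency path as entity words, and at least two entity words per group) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: the `Token` structure, the LTP-style labels (`SBV`, `VOB`, `IOB`, `ADV`, `POB`), the two path conditions, the treatment of each noun token as a base noun phrase, and the hand-annotated example sentence. The logistic regression filtering step is not shown.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    text: str
    pos: str     # simplified POS tag: "n" noun, "v" verb, "p" preposition
    head: int    # idx of the head token, 0 for the root
    deprel: str  # dependency label (LTP-style labels, assumed here)


# Hypothetical path conditions: a noun counts as an entity word for a verb if
# it is a direct subject/object dependent, or if it hangs off a preposition
# that modifies the verb (verb <-ADV- preposition <-POB- noun).
DIRECT_LABELS = {"SBV", "VOB", "IOB"}


def entities_for_verb(verb, tokens):
    """Return noun tokens linked to `verb` by one of the assumed valid paths."""
    found = []
    for tok in tokens:
        if tok.pos != "n":
            continue
        if tok.head == verb.idx and tok.deprel in DIRECT_LABELS:
            found.append(tok)
        elif tok.deprel == "POB" and tok.head > 0:
            prep = tokens[tok.head - 1]
            if prep.pos == "p" and prep.head == verb.idx and prep.deprel == "ADV":
                found.append(tok)
    return found


def candidate_groups(tokens):
    """Pair each verb with its entity words; keep groups with >= 2 entities."""
    groups = []
    for tok in tokens:
        if tok.pos != "v":
            continue
        entities = entities_for_verb(tok, tokens)
        if len(entities) >= 2:
            groups.append((tok.text, [e.text for e in entities]))
    return groups


# Hand-annotated parse of an illustrative sentence:
# 李明 (Li Ming) 在 (in) 北京 (Beijing) 创办 (founded) 公司 (a company)
sentence = [
    Token(1, "李明", "n", 4, "SBV"),
    Token(2, "在", "p", 4, "ADV"),
    Token(3, "北京", "n", 2, "POB"),
    Token(4, "创办", "v", 0, "HED"),
    Token(5, "公司", "n", 4, "VOB"),
]

if __name__ == "__main__":
    print(candidate_groups(sentence))
```

In a full pipeline, each candidate group would then be turned into a feature vector and passed to the trained classifier, which keeps or discards it.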