An Entity Relation Extraction Method Based on Wikipedia and Pattern Clustering (cited by: 23)
Abstract: This paper proposes a method based on Wikipedia and pattern clustering for extracting high-accuracy Chinese entity relation pairs from open text. It is the first to obtain relation instances by mapping entities from HowNet, a manually annotated knowledge system, to Wikipedia. By making full use of Wikipedia's structured features, the method handles entity recognition well and generates accurate, salient sentence instances. The paper further proposes a significance assumption and a keyword assumption, and builds on them a keyword-based classification and hierarchical clustering algorithm that markedly improves pattern reliability. Experimental results show that the method effectively improves the quality of sentence instances and patterns and achieves good extraction performance.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2012, No. 2, pp. 75-81, 127 (8 pages)
Funding: National Natural Science Foundation of China, Major Research Plan Cultivation Project (90920010); National 863 Program Key Project (2008AA01Z145)
Keywords: relation extraction; Wikipedia; pattern clustering
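
The abstract mentions hierarchical clustering of extraction patterns but gives no algorithmic detail. As a rough illustration only, the sketch below groups token patterns bottom-up. The Jaccard similarity, average linkage, the 0.5 stopping threshold, and the English example patterns are all assumptions made here for demonstration, not the paper's actual algorithm or data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two token sequences (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_similarity(c1, c2):
    """Average-linkage similarity between two clusters of patterns."""
    total = sum(jaccard(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def agglomerative_cluster(patterns, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches the threshold.

    `threshold` is a hypothetical parameter chosen for this example.
    """
    clusters = [[p] for p in patterns]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sim = cluster_similarity(clusters[i], clusters[j])
            if sim > best_sim:
                best_sim, best_pair = sim, (i, j)
        if best_sim < threshold:
            break
        i, j = best_pair  # i < j, so deleting j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up patterns; X and Y stand for the two entity slots.
patterns = [
    ("X", "was", "born", "in", "Y"),
    ("X", "was", "born", "at", "Y"),
    ("X", "is", "the", "capital", "of", "Y"),
    ("X", "is", "capital", "of", "Y"),
]
clusters = agglomerative_cluster(patterns, threshold=0.5)
for members in clusters:
    print(members)
```

With these inputs the two "born" patterns end up in one cluster and the two "capital" patterns in another. A real system would likely use a similarity measure suited to Chinese text and could first partition patterns by keyword, as the abstract suggests.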
