
Family relation extraction from Wikipedia by self-supervised learning

Cited by: 1
Abstract: Traditional supervised relation extraction requires a large amount of manually annotated training data, while semi-supervised learning suffers from low recall. To address this, a self-supervised learning approach is proposed for extracting family relationships between persons. First, semi-structured information in Chinese Wikipedia (family relation triples) is mapped onto its free text to automatically generate annotated training data. Then, family relations between person entities are extracted from Chinese Wikipedia text with a feature-based relation extraction method. Experimental results on a manually annotated family-network test set show that the method outperforms bootstrapping, reaching an F1-measure of 77%, which suggests that self-supervised learning can extract family relationships fairly effectively.
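Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2015, No. 4, pp. 1013-1016, 1020 (5 pages)
Funding: National Natural Science Foundation of China (61373096, 90920004); Major Project of Natural Science Research of Jiangsu Higher Education Institutions (11KJA520003)
Keywords: self-supervised learning; Wikipedia; semi-structured information; relation extraction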
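
The automatic labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_sentences`, the plain substring matching of person names, the `"none"` label for unlisted pairs, and the toy English data are all assumptions made for illustration. The idea is only that a sentence mentioning both persons of a known family relation triple becomes a training instance labeled with that relation.

```python
from itertools import combinations


def label_sentences(triples, sentences, persons):
    """Project (head, relation, tail) triples onto free-text sentences.

    A sentence that mentions both persons of a known triple is labeled with
    that triple's relation. Co-occurring person pairs with no known triple
    are labeled "none" (an assumption of this sketch).
    """
    known = {(head, tail): rel for head, rel, tail in triples}
    instances = []
    for sent in sentences:
        # Naive entity spotting: substring match on the person name.
        mentioned = [p for p in persons if p in sent]
        for a, b in combinations(mentioned, 2):
            if (a, b) in known:
                instances.append((sent, a, b, known[(a, b)]))
            elif (b, a) in known:
                instances.append((sent, b, a, known[(b, a)]))
            else:
                instances.append((sent, a, b, "none"))
    return instances


# Hypothetical toy data standing in for infobox triples and article text.
triples = [("Alice Wu", "spouse", "Bob Li")]
persons = ["Alice Wu", "Bob Li", "Carol Li"]
sentences = [
    "Alice Wu married Bob Li in 1990.",
    "Carol Li studied in Nanjing.",
    "Alice Wu met Carol Li at a conference.",
]

training_data = label_sentences(triples, sentences, persons)
for instance in training_data:
    print(instance)
```

The resulting labeled instances would then be fed to a feature-based classifier; the abstract does not specify the features, so none are shown here.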

