
Open Text Information Extraction (cited by: 61)

Open Information Extraction
Abstract Research on information extraction has moved from traditional tasks restricted to fixed categories and fixed domains toward open information extraction, i.e. extracting open categories of entities, relations, and events from open-domain text. The methods have likewise shifted from purely statistical machine learning models trained on human-annotated corpora to approaches that mine and integrate knowledge from large-scale, heterogeneous Web resources and combine it with statistical learning. This paper first reviews the history of research on text information extraction, then describes in detail the task definitions, difficulties, typical methods, evaluations, performance levels, and open problems of three main open information extraction tasks: entity extraction, entity disambiguation, and relation extraction. Finally, drawing on the authors' own research in this field, it analyzes and discusses the development directions of text information extraction and its applications in Web-scale knowledge engineering and question answering.
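Source Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 6, pp. 98-110 (13 pages)
Funding National Natural Science Foundation of China (grants 60875041 and 61070106)
Keywords open information extraction; knowledge engineering; text understanding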
