Research on Extraction of Maximal Location Entity Mention Based on Limited Field

Cited by: 1
Abstract: Entities are the basic units of event information and play an important role in events. In natural language processing, entity recognition is an important foundational tool for applications such as information extraction, syntactic analysis, machine translation, and discourse understanding. The nesting of syntactic constituents that is characteristic of Chinese makes entity expressions complex and harder to recognize, so existing named entity recognition methods do not perform well on long location entities. To study methods for extracting entities automatically, this paper starts from the domain of event reports and takes the maximal location entity as its object. Location entities were manually annotated and extracted in a corpus of 325 news articles, their occurrence characteristics were analyzed statistically, and a feasible extraction scheme is proposed on the basis of that analysis.
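Authors: Gao Yan (高燕), Liu Juan (刘娟)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2011, No. 7, pp. 72-74, 165 (4 pages)
Funding: Supported by the Natural Science Foundation of Guangdong Province (Grant No. 9151027501000039)
Keywords: entity; event; maximal location entity; extraction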