English Base Noun Phrase Identification Based on a Hybrid Strategy: Combining Boundary Statistics with Part-of-Speech String Rule Correction (cited by: 2)
Abstract  Base noun phrase (base NP) identification is an important subtask in natural language processing. This paper surveys representative base NP identification methods and compares the results of several typical approaches for English. It then proposes and implements a method for English base NP identification that combines boundary statistics with correction by part-of-speech (POS) string rules. The method splits the task into a primary part and a secondary part. Boundary statistics, the primary part, correctly identifies most base NPs. The POS string rules, the secondary part, check and correct the base NPs found by the primary part and also recover base NPs that boundary statistics missed. The rules thereby compensate for the inability of boundary statistics to account for the internal composition of base NPs, which improves both precision and recall. The method achieves a precision of 96.22%, a recall of 97.59%, and an F-score (β=1) of 96.90% on English base NP identification; the authors report that this F-score exceeds the best previously reported result.
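The reported F-score can be checked against the reported precision and recall. The sketch below assumes the standard F-measure definition, F = (1 + β²)·P·R / (β²·P + R), which the abstract does not spell out; the function name `f_beta` and the variable names are illustrative and not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (standard F-measure).

    Assumption: the paper uses this standard definition; the abstract only
    gives the resulting numbers.
    """
    if precision <= 0.0 and recall <= 0.0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported in the abstract, in percent.
reported_precision = 96.22
reported_recall = 97.59

f1 = f_beta(reported_precision, reported_recall)
print(f"F(beta=1) = {f1:.2f}%")
```

With β = 1 the formula reduces to 2PR / (P + R), and the computed value rounds to the 96.90% given in the abstract, so the three reported figures are mutually consistent.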
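Source  Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2004, No. 35, pp. 1-3, 121 (4 pages)
Funding  National Natural Science Foundation of China (Nos. 60302021, 60375019); National 863 High-Tech Research and Development Program, sub-project (No. 2002AA117010-09); Ministry of Science and Technology Intergovernmental International Cooperation Project (No. CI-2003-03)
Keywords  base noun phrase identification; English; hybrid strategy; chunk; boundary statistics; part-of-speech string rule correction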

References (6)

  • 1 Church K. A stochastic parts program and noun phrase parser for unrestricted text[C]. In: Proceedings of the Second Conference on Applied Natural Language Processing, 1988: 136-143
  • 2 Ramshaw, Marcus. Text chunking using transformation-based learning. In: Natural Language Processing Using Very Large Corpora, Kluwer. Originally appeared in WVLC-95, 1995: 82-94
  • 3 Cardie Claire, Pierce David. Error-driven pruning of treebank grammars for base noun phrase identification[C]. In: Proceedings of COLING-ACL'98, 1998: 218-224
  • 4 Zhao Jun, Huang Changning. A transformation-based model for Chinese base noun phrase identification[J]. Journal of Chinese Information Processing (中文信息学报), 1999, 13(2): 1-7. Cited by: 41
  • 5 Xun EnDong. An Unified Statistical Model to Identify English Base NP[C]. In: ACL-2000, The 38th Annual Meeting of the Association for Computational Linguistics, Hong Kong, 2000-10
  • 6 Guo Dong Zhou, Jian Su. Error-driven HMM-based Chunk Tagger with Context-dependent Lexicon. EMNLP/VLC-2000, Hong Kong, 2000-10

