A Mixed Statistical Model-Based Method for Chinese Named Entity Recognition (基于混合统计模型的汉语命名实体识别方法)

Cited by: 20
Abstract This paper presents a method for Chinese named entity (NE) recognition that combines a hidden Markov model (HMM) with a maximum entropy (ME) model. It targets three important NE types: personal names, location names, and organization names. The method has two distinguishing features. First, it provides a unified framework that integrates NE recognition with part-of-speech tagging. Second, it fuses two statistical models: the HMM constrains recognition globally, within the scope of a sentence, while the ME model estimates locally, from the context of the current word, the probability that a word string is labeled as a given NE type. Experimental results show that the method recognizes these three NE types fairly well.
Source Computer Engineering & Science (《计算机工程与科学》), CSCD, 2006, No. 6, pp. 135-139 (5 pages)
Funding National Natural Science Foundation of China (grant 60403050)
Keywords named entity recognition; hidden Markov model (HMM); maximum entropy model (ME)
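
The abstract describes the division of labor between the two models but not how their scores are combined. The following is a minimal sketch of one plausible combination, not the authors' implementation: a Viterbi search supplies the sentence-level (HMM-style) constraint through tag-transition scores, and a softmax over weighted binary features stands in for the ME model's local estimate. The tag set, feature strings, weights, and the names `me_local_probs` and `viterbi` are all hypothetical, and the joint part-of-speech tagging mentioned in the abstract is omitted.

```python
import math

# Hypothetical tag set: "O" (not an entity) plus the three NE types.
TAGS = ["O", "PER", "LOC", "ORG"]

def me_local_probs(features, weights):
    """ME-style local estimate: softmax over tags of summed feature weights."""
    scores = [sum(weights.get((f, t), 0.0) for f in features) for t in TAGS]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {t: e / z for t, e in zip(TAGS, exps)}

def viterbi(feature_seq, weights, log_trans, log_init):
    """Sentence-level decoding: transition scores plus ME local log-probabilities."""
    if not feature_seq:
        return []
    local = [me_local_probs(f, weights) for f in feature_seq]
    best = [{t: log_init[t] + math.log(local[0][t]) for t in TAGS}]
    back = []
    for i in range(1, len(feature_seq)):
        scores, ptr = {}, {}
        for t in TAGS:
            # Best previous tag for reaching tag t at position i.
            prev = max(TAGS, key=lambda p: best[-1][p] + log_trans[(p, t)])
            scores[t] = best[-1][prev] + log_trans[(prev, t)] + math.log(local[i][t])
            ptr[t] = prev
        best.append(scores)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]

# Toy parameters (assumed, not taken from the paper): uniform transitions
# and initial scores, and one indicative feature per word.
toy_weights = {("w=张三", "PER"): 2.0, ("w=去", "O"): 2.0, ("w=北京", "LOC"): 2.0}
uniform = math.log(1.0 / len(TAGS))
toy_trans = {(p, t): uniform for p in TAGS for t in TAGS}
toy_init = {t: uniform for t in TAGS}
toy_sentence = [["w=张三"], ["w=去"], ["w=北京"]]

print(viterbi(toy_sentence, toy_weights, toy_trans, toy_init))
# prints ['PER', 'O', 'LOC']
```

With uniform transitions the decoder simply follows the local ME estimates; in a trained system the transition scores would be estimated from a tagged corpus, which is where the sentence-level constraint would take effect.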