Abstract
This paper presents a method for Chinese named entity (NE) recognition that combines a Hidden Markov Model (HMM) with a maximum entropy model (ME). It targets three important types of NEs: personal names, location names, and organization names. The method has two distinguishing features. First, it treats NE recognition and part-of-speech tagging as a single unified task. Second, it fuses two statistical models: the HMM constrains recognition globally, over the scope of the sentence, while the ME estimates locally, from the context of the current word, the probability that a word string is labeled as a given type of NE. Experimental results show that the method recognizes these three types of NEs fairly well.
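The division of labor described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two models are actually combined, so everything below is an assumption made for illustration. A softmax over weighted context features stands in for the ME model. Viterbi decoding over tag-transition probabilities stands in for the sentence-level HMM constraint. The local ME probability is simply multiplied into the path score. The names `maxent_probs` and `viterbi`, the feature and weight format, and the toy tag set are all hypothetical.

```python
import math


def _log(p):
    # Floor the probability to avoid log(0); the floor value is an arbitrary choice.
    return math.log(max(p, 1e-12))


def maxent_probs(features, weights, tags):
    """Local ME-style model (assumed form): P(tag | context features) via softmax.

    `weights` maps (feature, tag) pairs to real-valued weights.
    """
    scores = {t: sum(weights.get((f, t), 0.0) for f in features) for t in tags}
    top = max(scores.values())
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}


def viterbi(local_probs, trans, start, tags):
    """Sentence-level decoding: transition probabilities constrain the tag path.

    `local_probs[i][t]` is the local probability of tag t at position i
    (for example, the output of maxent_probs for that position).
    """
    # best[i][t] = (log score of the best path ending in tag t at i, previous tag)
    best = [{t: (_log(start[t]) + _log(local_probs[0][t]), None) for t in tags}]
    for i in range(1, len(local_probs)):
        row = {}
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p][0] + _log(trans[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            row[t] = (score + _log(local_probs[i][t]), prev)
        best.append(row)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(best) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]
```

In a unified setup like the one the abstract describes, the tag set could hold both part-of-speech tags and NE tags, so a single decoding pass would label both. That reading is also an assumption, not a detail confirmed by the source.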
Source
《计算机工程与科学》
CSCD
2006, No. 6, pp. 135-139 (5 pages)
Computer Engineering & Science
Funding
Supported by the National Natural Science Foundation of China (60403050)
Keywords
named entity recognition
Hidden Markov Model (HMM)
maximum entropy model (ME)