Abstract
The maximum entropy models commonly used for named entity recognition combine the features of a random event linearly. This paper first analyzes the weight bias problem that arises in this linear combination of features. It then proposes fusing features before they are linearly combined, and selecting effective composite features according to the mutual information between each fused feature and the target class. Experiments on a corpus containing 2,000 personal names show that feature fusion effectively improves both the precision and the recall of named entity recognition.
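The selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary features, models fusion as the logical AND of two features, and ranks fused features by their empirical mutual information with the class label. The feature names (`surname_before`, `title_after`, `in_quotes`), the toy data, and the `top_k` parameter are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_fused_features(samples, labels, top_k=2):
    """Fuse every pair of binary features with AND and rank the pairs by MI.

    samples: list of dicts mapping a feature name to 0 or 1 (assumed format).
    labels:  list of class labels, one per sample.
    Returns the top_k (score, (name_a, name_b)) tuples, best first.
    """
    names = sorted(samples[0])
    scored = []
    for a, b in combinations(names, 2):
        fused = [s[a] & s[b] for s in samples]  # composite feature fires only if both fire
        scored.append((mutual_information(fused, labels), (a, b)))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): the label is 1 exactly when both
# surname_before and title_after fire.
samples = [
    {"surname_before": 1, "title_after": 1, "in_quotes": 0},
    {"surname_before": 1, "title_after": 0, "in_quotes": 1},
    {"surname_before": 0, "title_after": 1, "in_quotes": 1},
    {"surname_before": 0, "title_after": 0, "in_quotes": 0},
    {"surname_before": 1, "title_after": 1, "in_quotes": 1},
    {"surname_before": 0, "title_after": 1, "in_quotes": 0},
]
labels = [1, 0, 0, 0, 1, 0]

best_score, best_pair = select_fused_features(samples, labels, top_k=1)[0]
print(best_pair, round(best_score, 4))
```

In this toy data the fused pair that matches the label exactly receives the highest score, which is the behaviour a mutual-information criterion is meant to reward. A real system would also need to decide which feature pairs to consider and where to set the selection threshold, and the abstract does not specify either.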
Source
Journal of Computer Applications (《计算机应用》)
Indexed in: CSCD, PKU Core (北大核心)
2005, No. 11, pp. 2647-2649 (3 pages)
Funding
National Natural Science Foundation of China (60435020)
National 863 Program of China (2002AA117010-09)
Keywords
named entity recognition (NER)
feature combination
weight bias
feature fusion
maximum entropy model