Abstract
A method for recognizing disease named phrases in Chinese text is proposed, based on a maximum entropy model. In feature selection, domain ontology information is used as one of the model's features. The resulting disease named phrase recognition classifier thus combines supervised learning with the ability to exploit domain knowledge. Experimental results show a precision of 89.7%, a recall of 87.6%, and an F-measure of 88.64% for disease named phrase recognition.
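The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the `beta` parameter are illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    # Assumption: the paper's F-measure is the balanced F1 score.
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
f1 = f_measure(0.897, 0.876)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 88.64%"
```

Under that assumption the three reported figures are mutually consistent.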
Source
《北京理工大学学报》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2006, No. 6, pp. 517-520 (4 pages)
Transactions of Beijing Institute of Technology
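Funding
Supported by the Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20050007023)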
Keywords
maximum entropy model
feature selection
ontology
disease named phrase recognition