Abstract
In text categorization systems, feature selection is an effective way to reduce the dimensionality of the feature vector. After analyzing several commonly used evaluation functions for feature selection, this paper proposes a new feature selection method, extended mutual information (EMI), based on the characteristics of the mutual information algorithm. Experimental results show that the method is simple and feasible and helps improve the effectiveness of the selected feature subset.
Source
《微计算机信息》
2010, Issue 24, pp. 223-224, 219 (3 pages)
Control & Automation
Funding
Project of the Science and Technology Bureau of Zhangjiakou, Hebei Province (0921045B)
Keywords
text categorization
feature selection
evaluation function
mutual information