Abstract
Feature selection algorithms commonly used in text categorization consider only the correlation between features (terms) and classes, and pay too little attention to the correlation among the features themselves. Building on an analysis of feature correlation, this paper proposes a new algorithm that addresses this shortcoming. Experiments verify the feasibility and effectiveness of the algorithm and show that it can improve the precision of text classification.
Source
Information Technology and Informatization (《信息技术与信息化》)
2009, No. 6, pp. 39-41, 45 (4 pages in total)
Keywords
Feature selection
Text categorization
Text set density