
Improved mutual information based on difference factor among classes
(Chinese title: 一种基于类差分度的互信息特征选择方法; cited by: 2)
Abstract: An improved mutual information feature selection method is proposed by introducing a difference degree among classes. A relative term frequency factor is also introduced to counter the tendency of traditional mutual information to select low-frequency terms. Together these improve the accuracy of feature selection and raise the precision and efficiency of classification. Text classification experiments show that the proposed method increases the average recall and average precision by 11.26% and 8.04%, respectively, and the average F1 score by 18.55%.
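The low-frequency bias mentioned in the abstract can be illustrated with the classical mutual information score, MI(t, c) = log(P(t, c) / (P(t) P(c))). The sketch below is only that baseline, estimated from document counts on a made-up toy corpus. It is not the paper's improved formula, which the abstract does not give. The function name, the corpus, and the class labels are all assumptions for illustration.

```python
import math

def mutual_information(docs, labels, term, cls):
    """Classical MI(t, c) = log(P(t, c) / (P(t) * P(c))), estimated from document counts."""
    n = len(docs)
    # Documents of class `cls` that contain `term`
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == cls)
    n_t = sum(1 for d in docs if term in d)    # documents containing `term`
    n_c = sum(1 for y in labels if y == cls)   # documents in class `cls`
    if a == 0:
        return float("-inf")                   # term never co-occurs with the class
    return math.log((a * n) / (n_t * n_c))

# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"goal", "match"}, {"goal", "team"}, {"goal", "offside"}, {"team", "match"},
    {"stock", "goal"}, {"stock", "bank"}, {"bank", "rate"}, {"rate", "stock"},
]
labels = ["sports"] * 4 + ["finance"] * 4

mi_common = mutual_information(docs, labels, "goal", "sports")     # in 3 of 4 sports documents
mi_rare = mutual_information(docs, labels, "offside", "sports")    # in 1 of 4 sports documents

# The rare term gets the higher score even though it covers far fewer documents.
print(f"goal: {mi_common:.3f}, offside: {mi_rare:.3f}")
```

Here "offside" outranks "goal" purely because it never appears outside the class, although it occurs in only one document. This is the kind of behaviour that a term frequency weighting is meant to correct.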
Source: China Sciencepaper (《中国科技论文》), indexed in CAS and the Peking University Core Journals list, 2015, No. 20, pp. 2386-2389 (4 pages)
Keywords: computer application; feature selection; mutual information; relative term frequency factor; difference factor among classes
