Journal article

A KL-divergence Based Feature Selection Algorithm with the Separate-class Strategy
(一种基于KL散度和类分离策略的特征选择算法)

Cited by: 9
Abstract: Feature selection is one of the core issues in designing pattern recognition and machine learning systems, and the quality of the selected feature subset directly affects the efficiency and accuracy of the subsequent classifier. Most existing feature selection methods perform relevance and redundancy analysis only from the point of view of the whole class label set, neglecting the relation between features and each separate class label. To this end, a novel KL-divergence based feature selection algorithm is proposed that explicitly handles relevance and redundancy analysis for each class label with a separate-class strategy. An effective distance metric based on KL-divergence is also introduced to measure the relevance between each class label and a feature, as well as the redundancy between features. Experimental results show that the proposed algorithm is efficient and significantly outperforms the three representative algorithms CFS, FCBF, and ReliefF with respect to the quality of the selected feature subsets.
Source: Computer Science (《计算机科学》, CSCD, Peking University Core Journal), 2012, No. 12, pp. 224-227 (4 pages).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60973085).
Keywords: Feature selection; KL-divergence; Separate-class strategy; Effective distance
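The separate-class idea described in the abstract can be illustrated with a small sketch: for a discrete feature, score its relevance to each class as the KL divergence between the feature's distribution within that class and its marginal distribution over all samples. This is only a minimal illustration of the per-class scoring strategy, not the authors' exact effective-distance formulation; the function names and toy data below are hypothetical.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q) for two discrete distributions given as arrays.
    # A small eps avoids log(0); both arrays are renormalized.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def per_class_relevance(x, y):
    """Score one discrete feature x against each class label in y,
    as KL(P(x | class=c) || P(x)) -- one relevance score per class."""
    values = np.unique(x)
    marginal = np.array([(x == v).mean() for v in values])
    scores = {}
    for c in np.unique(y):
        mask = (y == c)
        conditional = np.array([(x[mask] == v).mean() for v in values])
        scores[c] = kl_divergence(conditional, marginal)
    return scores
```

A feature that separates a class well has a within-class distribution far from the marginal, hence a large score for that class, while an uninformative feature scores near zero for every class; in principle the same divergence-based score could also be applied between pairs of features to estimate redundancy.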

References (15 total; the first 10 are listed)

  • 1. Dash M, Liu H. Feature selection for classification [J]. Intelligent Data Analysis, 1997, 1(1-4): 131-156.
  • 2. Robnik-Sikonja M, Kononenko I. Theoretical and empirical analysis of ReliefF and RReliefF [J]. Machine Learning, 2003, 53: 23-69.
  • 3. Liu H, Sun J, Liu L, et al. Feature selection with dynamic mutual information [J]. Pattern Recognition, 2009, 42(7): 1330-1339.
  • 4. Jain A K, Duin R P W, Mao J. Statistical pattern recognition: a review [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(1): 4-37.
  • 5. Ren Yong-gong, Lin Nan. DPFS: a dynamic-programming based text feature selection algorithm [J]. Computer Science, 2009, 36(6): 188-191. (in Chinese)
  • 6. Hall M A. Correlation-based feature selection for discrete and numeric class machine learning [C] // Proceedings of the 17th International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 2000: 359-366.
  • 7. Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data [C] // Proceedings of the IEEE Computer Society Conference on Bioinformatics. Washington DC: IEEE Computer Society Press, 2003: 523-528.
  • 8. Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(8): 1226-1238.
  • 9. Sotoca J M, Pla F. Supervised feature selection by clustering using conditional mutual information-based distances [J]. Pattern Recognition, 2010, 43(6): 2068-2081.
  • 10. Yu L, Liu H. Efficient feature selection via analysis of relevance and redundancy [J]. Journal of Machine Learning Research, 2004, 5: 1205-1224.

