Ensemble Feature Selection Based on Normalized Mutual Information and Diversity (cited by 3)
Abstract: How to construct base classifiers with high diversity is a central problem in ensemble learning. To address it, an iterative selection method is proposed: an optimal feature subset is extracted under the criterion of maximizing normalized mutual information, and a base classifier is trained on that subset; the resulting classifier is then evaluated with the number of misclassified samples as the diversity measure. If the acceptance condition is met, the iteration stops; otherwise it repeats until termination. Finally, the recognition results of the selected base classifiers are fused by weighted voting. The algorithm was validated in experiments on public UCI data sets with a support vector machine as the base classifier, and compared against a single SVM (Single-SVM), the classical Bagging ensemble (Bagging-SVM), and the Attribute Bagging ensemble (AB-SVM). The experimental results show that the method achieves higher classification accuracy.
Source: Computer Science (《计算机科学》), CSCD / Peking University core journal, 2013, No. 6: 225-228 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60975026, 61273275)
Keywords: Ensemble learning, Ensemble feature selection, Mutual information, Diversity
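
The paper itself includes no code. Below is a minimal Python sketch of the loop the abstract describes, under explicit assumptions: scikit-learn's normalized_mutual_info_score over discretized feature columns stands in for the normalized-mutual-information criterion (one common normalization is NMI(X_j; Y) = I(X_j; Y) / sqrt(H(X_j) H(Y)); the paper's exact definition is not reproduced here), NMI-biased random subset sampling replaces the unspecified subset-search step, and the average disagreement between misclassified-sample patterns serves as the diversity test. The dataset, subset size, thresholds, and round count are illustrative choices, not the authors'.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import normalized_mutual_info_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.svm import SVC

# A UCI-style binary classification task, split into train/test.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
n_classes = len(np.unique(y))

# Discretize each feature so NMI against the class labels is well defined,
# then rank features by normalized mutual information.
X_tr_d = KBinsDiscretizer(n_bins=10, encode="ordinal",
                          strategy="uniform").fit_transform(X_tr)
nmi = np.array([normalized_mutual_info_score(X_tr_d[:, j], y_tr)
                for j in range(X_tr.shape[1])])

rng = np.random.default_rng(0)
members = []                  # accepted (feature subset, classifier, weight)
member_errors = []            # training-set error pattern of each member
T, k, min_div = 20, 10, 0.02  # rounds, subset size, diversity bar (illustrative)

for _ in range(T):
    # Draw a candidate feature subset, biased toward high-NMI features.
    subset = rng.choice(X_tr.shape[1], size=k, replace=False, p=nmi / nmi.sum())
    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr[:, subset], y_tr)
    errors = clf.predict(X_tr[:, subset]) != y_tr

    # Diversity test: average disagreement between this classifier's
    # misclassified-sample pattern and those of the accepted members.
    if member_errors:
        disagreement = np.mean([np.mean(errors != e) for e in member_errors])
        if disagreement < min_div:
            continue          # too similar to the ensemble: discard, iterate
    member_errors.append(errors)
    members.append((subset, clf, 1.0 - errors.mean()))  # accuracy as weight

def predict(X_new):
    """Weighted voting over the accepted base classifiers."""
    votes = np.zeros((len(X_new), n_classes))
    for subset, clf, w in members:
        votes[np.arange(len(X_new)), clf.predict(X_new[:, subset])] += w
    return votes.argmax(axis=1)

print("ensemble size:", len(members),
      "| test accuracy:", round(float(np.mean(predict(X_te) == y_te)), 4))
```

Candidates whose error pattern is too close to the ensemble's are discarded and the loop iterates, mirroring the accept-or-retry structure in the abstract; accepted members then vote with their training accuracy as weight.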

References (19; first 10 shown)

  • 1 Opitz D. Feature selection for ensembles[C]//Proceedings of the 16th National Conference on Artificial Intelligence (AAAI-99). Orlando, FL, USA, 1999: 379-384.
  • 2 Ho T K. The random subspace method for constructing decision forests[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(8): 832-844.
  • 3 Bryll R, Gutierrez-Osuna R, Quek F. Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets[J]. Pattern Recognition, 2003, 36(6): 1291-1302.
  • 4 Oliveira L S, Morita M, Sabourin R. Multi-objective genetic algorithms to create ensemble of classifiers[C]//Proc. of EMO 2005. Guanajuato, Mexico, 2005: 592-606.
  • 5 Li Xia, Wang Lianxi, Jiang Shengyi. Ensemble feature selection for imbalanced problems[J]. Journal of Shandong University (Engineering Science), 2011, 41(3): 7-11. (in Chinese)
  • 6 Sun Liang, Han Chongzhao, Shen Jianjing, Dai Ning. A generalized rough set method for ensemble feature selection and multiple classifier fusion[J]. Acta Automatica Sinica, 2008, 34(3): 298-304. (in Chinese)
  • 7 Zhang Hongda, Wang Xiaodan, Han Jun, Xu Hailong. Research on the diversity of classifier ensembles[J]. Systems Engineering and Electronics, 2009, 31(12): 3007-3012. (in Chinese)
  • 8 Dietterich T G. Ensemble methods in machine learning[C]//Proc. of the 1st Int'l Workshop on Multiple Classifier Systems (MCS 2000). Italy: LNCS, Springer, 2000: 1-15.
  • 9 Kuncheva L I, Skurichina M, Duin R P W. An experimental study on diversity for bagging and boosting with linear classifiers[J]. Information Fusion, 2002, 3: 245-258.
  • 10 Dietterich T G. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization[J]. Machine Learning, 2000, 40(2): 139-158.
