
Algorithm for wrapper feature selection based on feature clustering (cited by: 3)
Abstract: To obtain an optimal feature subset of multi-dimensional data, a feature selection algorithm based on feature clustering and a wrapper (FC_W) was proposed. In the initial stage, the original feature set was dynamically divided into a number of feature subspaces using three-way decision theory, and the features within each subspace were grouped using a feature clustering algorithm. Representative features were selected from each feature cluster, and the remaining features were sorted in descending order of neighborhood mutual information (NMI) and iteratively selected. In this selection process, a wrapper was used to evaluate whether each candidate feature should be retained, yielding an optimal feature subset with the minimum classification error rate. Experimental results on UCI data sets show that, compared with other published feature selection algorithms, the proposed algorithm achieves higher classification accuracy with the libSVM, J48, Naive Bayes and KNN classifiers.
Authors: 胡峰, 杨梦
Source: Computer Engineering and Design (《计算机工程与设计》, Peking University core journal), 2018, Issue 1, pp. 230-237 (8 pages)
Funding: National Natural Science Foundation of China (61309014); Humanities and Social Sciences Planning Fund of the Ministry of Education (15XJA630003); Chongqing Basic and Frontier Research Program (cstc2013jcyjA40063); Science and Technology Research Program of Chongqing Municipal Education Commission (KJ1400412, KJ1500416)
Keywords: feature selection; feature clustering; wrapper; neighborhood mutual information (NMI); three-way decision

