Feature Selection Method Based on Coefficient of Variation and Maximum Feature Tree (cited by: 5)
Abstract: Feature selection is a key step in data mining, and feature contribution scoring and feature optimization are its core parts. For feature contribution scoring, this paper proposes CVMI (coefficient of variation and mutual information), a method that uses the coefficient of variation to measure intra-class distance and mutual information to measure inter-class distance, and applies it within an embedded feature selection method to choose features. Experiments use four UCI data sets, one remote sensing data set, and one birdsong data set, and compare CVMI against seven feature contribution scoring methods. The results show that CVMI is more consistent with the objective behavior expected of feature contribution evaluation and achieves better results than the other seven methods. In addition, the paper builds a maximum feature tree from CVMI scores and combines it with two-neighborhood redundancy removal to form the feature optimization method CVMI-RRMFT (remove redundancy of maximum feature tree). Experiments on the same data sets show that this method both reduces data dimensionality effectively and improves classification accuracy.
Authors: 徐海峰, 张雁, 刘江, 吕丹桔 (Xu Haifeng; Zhang Yan; Liu Jiang; Lü Danjv), School of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China
Source: 《南京师大学报(自然科学版)》 (Journal of Nanjing Normal University (Natural Science Edition)), indexed in CAS, CSCD, and the Peking University Core Journals list, 2021, No. 1, pp. 111-118 (8 pages)
Funding: National Natural Science Foundation of China (61462078, 31860332); Scientific Research Fund of the Yunnan Provincial Department of Education (2017ZZX212).
Keywords: feature selection; feature contribution scoring; coefficient of variation; mutual information; maximum feature tree; two-neighborhood redundancy removal
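
The abstract names the two ingredients of the CVMI score but not how they are combined. The sketch below is only an illustration of those ingredients in plain Python: a within-class coefficient of variation and a mutual-information estimate for one feature. The equal-width binning, the zero-mean guard, the helper names, and especially the ratio used in `cvmi_like_score` are assumptions made for this example, not the formula from the paper.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def within_class_cv(values, labels):
    """Average coefficient of variation (std / |mean|) of one feature inside each class."""
    cvs = []
    for c in set(labels):
        xs = [v for v, y in zip(values, labels) if y == c]
        m = mean(xs)
        # Assumption: a class with zero mean contributes a CV of 0.
        cvs.append(pstdev(xs) / abs(m) if m != 0 else 0.0)
    return mean(cvs)


def mutual_information(values, labels, bins=4):
    """Mutual information (in nats) between an equal-width-binned feature and the labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    p_x, p_y, p_xy = Counter(binned), Counter(labels), Counter(zip(binned, labels))
    return sum(
        (c / n) * math.log(c * n / (p_x[x] * p_y[y]))
        for (x, y), c in p_xy.items()
    )


def cvmi_like_score(values, labels, eps=1e-9):
    """Hypothetical combination: high MI and low within-class CV give a high score."""
    return mutual_information(values, labels) / (within_class_cv(values, labels) + eps)


labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]  # separates the two classes
noisy = [1.0, 5.0, 2.0, 4.0, 1.5, 4.5, 2.5, 3.5]        # overlaps across classes
scores = {
    "informative": cvmi_like_score(informative, labels),
    "noisy": cvmi_like_score(noisy, labels),
}
```

On this toy data the feature that is tight within each class and well separated between classes receives the higher score, which matches the intuition described in the abstract. The maximum feature tree and the two-neighborhood redundancy removal are not sketched here because the abstract does not describe how they are constructed.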