
Relief Feature Selection Algorithm on Unbalanced Datasets

Cited by: 15
Abstract: The Relief family of feature selection methods comprises the original Relief algorithm and its later extension, ReliefF. The core idea is to assign larger weights to features that contribute more to classification. These algorithms are simple and run efficiently, so they are widely used. However, applying Relief directly to noisy or unbalanced datasets gives unsatisfactory results. Building on Relief, this paper proposes a feature selection algorithm for noisy data, called threshold-Relief, which effectively eliminates the influence of noisy data on classification results. Combined with the K-means algorithm, two feature selection algorithms for unbalanced datasets are proposed, called K-means-ReliefF and K-means-Relief sampling, which effectively compensate for the shortcomings of Relief on unbalanced datasets. Experiments demonstrate the effectiveness of the proposed algorithms.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2016, No. 4, pp. 838-844 (7 pages)
Funding: National Natural Science Foundation of China (61273294); Shanxi Province Science and Technology Basic Conditions Platform Project (2014091004-0104)
Keywords: feature selection; Relief algorithm; ReliefF algorithm; unbalanced datasets
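
The abstract's core idea, giving larger weights to features that contribute more to classification, can be illustrated with a minimal sketch of the classic two-class Relief weight update. This is not the paper's threshold-Relief or K-means variants, whose details the abstract does not give. The function name `relief`, its parameters, and the toy data are assumptions made for illustration only.

```python
import random


def relief(X, y, n_iter=None, seed=0):
    """Classic two-class Relief weights (illustrative sketch, not the paper's method)."""
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        # Normalised difference of feature j between two instances.
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # Manhattan distance over the normalised features.
        return sum(diff(j, a, b) for j in range(d))

    rng = random.Random(seed)
    m = n_iter or n
    # Visit every instance once by default, otherwise sample with replacement.
    order = list(range(n)) if m == n else [rng.randrange(n) for _ in range(m)]

    w = [0.0] * d
    for i in order:
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward separation from the nearest miss,
            # penalise separation from the nearest hit.
            w[j] += (diff(j, X[i], X[near_miss]) - diff(j, X[i], X[near_hit])) / m
    return w


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
print(weights)  # feature 0 receives the larger weight
```

On an unbalanced dataset most sampled instances come from the majority class, so the weights mainly reflect majority-class neighbourhoods. That is the shortcoming the abstract says the K-means-based variants are meant to address.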