Ensemble Feature Selection Method with Fast Transfer Model
Abstract Compared with traditional ensemble feature selection methods, the recently developed ensemble feature selection with block-regularized m×2 cross-validation (EFSBCV) yields an estimator whose variance is smaller than that of random m×2 cross-validation, and it raises the selection probability of important features while lowering that of noise features. However, the linear regression model used in EFSBCV contains only an error term and no bias term, so the fitted hyperplane always passes through the origin, which easily leads to underfitting. Moreover, EFSBCV does not consider the importance of each feature subset. To address these two problems, this paper proposes an ensemble feature selection method using a fast transfer model (EFSFT). The basic idea is that the base feature selector in EFSBCV adopts the proposed fast transfer model, which introduces the bias term. EFSFT transfers the 2m feature subsets as source knowledge and then re-quantifies the weight of each feature subset; the linear model with the added bias term has better fitting ability. Experiments on real datasets show that, compared with EFSBCV, EFSFT reduces the average FP value by up to 58%, indicating that EFSFT is better at removing noise features. Compared with the least-squares support vector machine (LSSVM), EFSFT increases the average TP value by up to 5%, indicating that EFSFT is better at selecting important features.
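The two ideas in the abstract (a linear model without a bias term is forced through the origin, and features are voted on across 2m half-sample subsets) can be illustrated with a minimal sketch. This is not the authors' EFSFT or EFSBCV: it assumes plain random m×2 splits rather than block-regularized ones, ordinary least squares as the base selector, and unweighted voting in place of the fast transfer model. The names `fit_linear` and `ensemble_select`, and the synthetic data, are hypothetical.

```python
import numpy as np

def fit_linear(X, y, with_bias):
    """Ordinary least squares; optionally append a constant column as the bias term."""
    A = np.hstack([X, np.ones((X.shape[0], 1))]) if with_bias else X
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(np.mean(resid ** 2))

def ensemble_select(X, y, m=5, top_k=3, seed=0):
    """Selection frequency of each feature over 2m half-sample fits.

    Assumption: m random 2-fold splits (not block-regularized) and
    equal weight for every subset (no transfer-based reweighting).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    votes = np.zeros(d)
    for _ in range(m):
        perm = rng.permutation(n)
        for half in (perm[: n // 2], perm[n // 2:]):
            coef, _ = fit_linear(X[half], y[half], with_bias=True)
            # Rank features by absolute coefficient, ignoring the bias entry.
            top = np.argsort(-np.abs(coef[:d]))[:top_k]
            votes[top] += 1
    return votes / (2 * m)

# Synthetic data: features 0 and 1 matter, the rest are noise; offset of 10.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 10.0 + 0.1 * rng.normal(size=200)

_, mse_no_bias = fit_linear(X, y, with_bias=False)
_, mse_bias = fit_linear(X, y, with_bias=True)
freq = ensemble_select(X, y, m=5, top_k=2)
print("MSE without bias:", mse_no_bias)
print("MSE with bias:   ", mse_bias)
print("selection frequency per feature:", freq)
```

On data with a nonzero offset, the fit without a bias term leaves a much larger residual, which is the underfitting the abstract attributes to the origin-constrained hyperplane. The per-feature frequencies play the role of the selection probabilities mentioned for important and noise features.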
Authors NING Baobin (宁保斌); WANG Shitong (王士同) (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China)
Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, PKU Core, 2024, No. 2, pp. 496-505 (10 pages)
Funding Natural Science Foundation of Jiangsu Province (BK20191331).
Keywords ensemble feature selection; cross-validation; transfer learning; regression