
Filtered Feature Selection Algorithm Based on Persistent Homology (cited by: 1)
Abstract Existing filter feature selection algorithms ignore the correlations between features. To address this, the paper proposes a new filter feature selection algorithm based on persistent homology (the Rel-Betti algorithm), which can identify both the correlations between features and their combined effect. It introduces the concept of relevant Betti numbers to extract the important topological feature information in a dataset. After preprocessing the dataset, the algorithm partitions it according to the class labels, computes the relevant Betti numbers within each class, obtains the feature means of the data, and ranks the features by importance according to the size of the differences between these feature means. Using eight datasets from the UCI repository, the algorithm is compared with other common algorithms under four learning models: decision tree, random forest, K-nearest neighbor, and support vector machine. The results show that Rel-Betti is an effective feature selection algorithm: it improves classification accuracy and F1 score, and it does not depend on a specific machine learning model.
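The ranking step described in the abstract (preprocess, split by class label, summarize each class per feature, sort by the gap between summaries) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define relevant Betti numbers, so the per-class summary here is a hypothetical stand-in (`column_means`, a plain mean of min-max scaled values), and the function names, the two-class restriction, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def min_max_scale(rows):
    """Preprocessing stand-in: scale every column to [0, 1] (constant columns become 0.0)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def column_means(rows):
    """Hypothetical per-class summary: plain column means.

    The paper computes relevant Betti numbers at this point; the abstract
    does not define them, so this placeholder is NOT the Rel-Betti statistic.
    """
    n = len(rows)
    return [sum(c) / n for c in zip(*rows)]


def rank_features(rows, labels, summarize=column_means):
    """Return (feature indices by decreasing class gap, the gaps themselves)."""
    scaled = min_max_scale(rows)
    groups = defaultdict(list)
    for row, y in zip(scaled, labels):
        groups[y].append(row)  # split the samples by class label
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = (summarize(g) for g in groups.values())
    gaps = [abs(x - y) for x, y in zip(a, b)]
    order = sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)
    return order, gaps


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [3.0, 5.0, 0.1], [3.2, 5.1, 0.8]]
labels = [0, 0, 1, 1]
order, gaps = rank_features(rows, labels)
print(order, gaps)
```

The `summarize` argument is kept pluggable so that a topological statistic could be swapped in for the placeholder without changing the ranking logic.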
Authors YIN Xingzi; PENG Ningning; ZHAN Xueyan (School of Science, Wuhan University of Technology, Wuhan 430070, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2023, No. 6, pp. 159-166 (8 pages)
Funding National Natural Science Foundation of China (11701438).
Keywords Feature selection; Persistent homology; Barcode; Betti number; Machine learning