Abstract
To address the problems that existing feature selection methods remain sensitive to noise and cannot effectively unify clustering performance with reconstruction performance, a robust feature selection method is proposed. Starting from the idea of taking the difference between clean data and reconstructed data, a robust reconstruction error term is built as the difference between the low-rank reconstruction and the projection reconstruction, and the features used for clustering are selected from the learned clean data rather than from the original data. Clean-data learning and feature selection are optimized jointly so that each promotes the other, which improves the robustness of the method on noisy data and effectively unifies reconstruction and clustering. Clustering experiments on five datasets compare the proposed method with several graph-embedding-based and PCA-reconstruction-based feature selection methods. The results show that, except on the noisy LUNG dataset, the proposed method outperforms the compared feature selection methods under both evaluation metrics (ACC and NMI).
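The two evaluation metrics named in the abstract, ACC and NMI, can be sketched as follows. This is a minimal standard-library illustration and not the authors' evaluation code: the function names are hypothetical, the brute-force search over label assignments is only practical for a handful of clusters (a Hungarian-algorithm solver is the usual choice for larger problems), and the geometric-mean normalization of NMI is an assumption, since the abstract does not say which variant was used.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def clustering_accuracy(true_labels, cluster_ids):
    """ACC: best fraction of samples matched under a one-to-one cluster-to-class map."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    # Pad the shorter side with None so every cluster gets a (possibly dummy) partner.
    size = max(len(classes), len(clusters))
    classes += [None] * (size - len(classes))
    clusters += [None] * (size - len(clusters))
    # overlap[(cluster, class)] = number of samples carrying that pair of labels.
    overlap = Counter(zip(cluster_ids, true_labels))
    best = 0
    # Brute force over all assignments (assumption: only a few clusters).
    for perm in permutations(classes):
        matched = sum(overlap[(c, k)] for c, k in zip(clusters, perm))
        best = max(best, matched)
    return best / len(true_labels)


def normalized_mutual_info(true_labels, cluster_ids):
    """NMI with geometric-mean normalization (assumed variant): I(Y;C) / sqrt(H(Y) H(C))."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    count_true = Counter(true_labels)
    count_clus = Counter(cluster_ids)
    # Mutual information between the two labelings.
    mi = sum(
        (c / n) * log(c * n / (count_true[a] * count_clus[b]))
        for (a, b), c in joint.items()
    )
    # Entropy of each labeling.
    h_true = -sum((c / n) * log(c / n) for c in count_true.values())
    h_clus = -sum((c / n) * log(c / n) for c in count_clus.values())
    denom = sqrt(h_true * h_clus)
    return mi / denom if denom > 0 else 0.0
```

Both scores are invariant to how the cluster indices are numbered, which is why they suit comparing feature selection methods through the clusterings they produce.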
Authors
YI Shuangyan
LIANG Yongsheng
LU Jingjing
LIU Wei
HU Tao
HE Zhenyu
Affiliations: School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen 518000, China; School of Electronics and Information Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen 518000, China; Shenzhen Guowei Fuxin Technology Co., Ltd, Shenzhen 518000, China; School of Computer Sciences, Shenzhen Institute of Information Technology, Shenzhen 518000, China; Institute of Information Technology, Shenzhen Institute of Information Technology, Shenzhen 518000, China; School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518000, China
Source
Journal on Communications (《通信学报》)
EI
CSCD
Peking University Core Journals
2023, Issue 3, pp. 209-219 (11 pages)
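Funding
National Natural Science Foundation of China (No. 61906124, No. 62031013)
China Postdoctoral Science Foundation (No. 2018M630158)
Natural Science Foundation of Guangdong Province (No. 2022A1515011447)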
Keywords
reconstruction
low-rank
projection
sparsity
feature selection