Unsupervised Feature Selection Algorithm Based on L_(2,0) Norm Constraint and Redundancy Learning
Abstract To better eliminate redundancy among features, a sparse unsupervised feature selection algorithm that integrates feature redundancy learning with sparse learning is proposed. First, the algorithm uses the L1 norm to measure the loss between the projected data points and the clustering labels, and introduces an auxiliary variable to separate the orthogonality and non-negativity constraints on the coding matrix of the cluster labels, ensuring that the coding matrix is non-negative and closer to the ideal labels. Second, cosine similarity is used to construct a feature redundancy matrix, which serves as a regularization term in learning the projection matrix and reduces dependence among features. Finally, constraining the projection matrix with the L_(2,0) norm yields exactly k non-zero rows, which in turn select k features of the original data. Together these give a sparse unsupervised feature selection model based on the L_(2,0) norm constraint and feature redundancy learning. The proposed algorithm is compared with 10 related algorithms on 12 public datasets, and the experimental results show that in most cases it selects more discriminative features.
Authors MENG Yingying; LI Qiaoyan; YANG Xiaofei; YUAN Lin (School of Science, Xi'an Polytechnic University, Xi'an 710600, China)
Source Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》), 2023, No. 5, pp. 81-88 (8 pages); indexed in CAS and the Peking University Core Journals list (北大核心)
Funding National Natural Science Foundation of China (61976130); Shaanxi Provincial Key Research and Development Program (2018KW-021); Natural Science Foundation of Shaanxi Province (2022KRM170)
Keywords feature selection; sparse learning; feature redundancy; matrix factorization; unsupervised learning
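
The two building blocks named in the abstract, a cosine-similarity redundancy matrix over features and an L_(2,0) constraint that leaves exactly k non-zero rows in the projection matrix, can be illustrated with a minimal NumPy sketch. This is not the authors' optimization procedure: the function names `feature_redundancy` and `project_l20`, the random data, and the use of a simple "keep the k largest-norm rows" projection are all assumptions made here for illustration only.

```python
import numpy as np

def feature_redundancy(X):
    """Cosine similarity between the feature columns of X (n_samples x n_features).

    Hypothetical helper: the paper builds a redundancy matrix from cosine
    similarity, but its exact form is not given in the abstract.
    """
    norms = np.linalg.norm(X, axis=0)
    norms = np.where(norms == 0, 1.0, norms)  # guard against all-zero features
    Xn = X / norms
    return Xn.T @ Xn

def project_l20(W, k):
    """Keep the k rows of W with the largest L2 norm and zero out the rest.

    Hypothetical helper: this is the standard projection onto the set of
    matrices with at most k non-zero rows, used here as a stand-in for the
    L_(2,0) constraint.
    """
    row_norms = np.linalg.norm(W, axis=1)
    keep = np.sort(np.argsort(row_norms)[-k:])
    W_sparse = np.zeros_like(W)
    W_sparse[keep] = W[keep]
    return W_sparse, keep

# Toy data (assumed): 50 samples, 6 features, feature 3 is a rescaled copy of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
X[:, 3] = 2.0 * X[:, 0]

R = feature_redundancy(X)          # R[0, 3] is close to 1: the two features are redundant
W = rng.normal(size=(6, 2))        # stand-in for a learned projection matrix
W_sparse, selected = project_l20(W, k=3)
print("redundancy between features 0 and 3:", R[0, 3])
print("selected feature indices:", selected)
```

The indices of the non-zero rows of `W_sparse` play the role of the k selected features; in the model described in the abstract, the redundancy matrix would additionally enter the objective as a regularization term while the projection matrix is being learned.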