Unsupervised Feature Selection Algorithm Based on L_(2,0) Norm Constraint and Redundancy Learning


Abstract: To better eliminate redundancy among features, a sparse unsupervised feature selection algorithm that integrates feature redundancy learning with sparse learning is proposed. First, the algorithm uses the L1 norm to measure the loss between the projected data points and the clustering labels, and introduces an auxiliary variable to separate the orthogonality and nonnegativity constraints on the coding matrix of the cluster labels, which ensures that the coding matrix is nonnegative and closer to the ideal labels. Second, cosine similarity is used to construct a feature redundancy matrix, which serves as a regularization term that constrains how the projection matrix is learned and reduces dependence among the selected features. Finally, constraining the projection matrix with the L_(2,0) norm yields exactly k non-zero rows, which in turn select k features of the original data. The result is a sparse unsupervised feature selection model based on the L_(2,0) norm constraint and feature redundancy learning. The proposed algorithm is compared with 10 related algorithms on 12 public datasets, and the experimental results show that in most cases it selects more discriminative features.
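The two building blocks named in the abstract, a cosine-similarity redundancy matrix and an L_(2,0) constraint that leaves exactly k non-zero rows, can be illustrated with a minimal NumPy sketch. This is not the authors' optimization procedure: the function names (`cosine_redundancy`, `l20_norm`, `keep_top_k_rows`), the use of the raw cosine similarity as the redundancy score, and the rule of keeping the k rows with the largest L2 norm are assumptions made for illustration only. The loss term, the auxiliary variable, and the coding matrix are not modeled here.

```python
import numpy as np


def cosine_redundancy(X):
    """Cosine similarity between feature columns of X (n_samples x n_features).

    Assumption: the redundancy matrix is the plain pairwise cosine similarity;
    the paper's exact construction may differ.
    """
    norms = np.linalg.norm(X, axis=0)
    norms = np.where(norms == 0, 1.0, norms)  # avoid division by zero
    Xn = X / norms
    return Xn.T @ Xn


def l20_norm(W, tol=1e-12):
    """L_(2,0) norm: number of rows of W whose L2 norm is non-zero."""
    return int(np.sum(np.linalg.norm(W, axis=1) > tol))


def keep_top_k_rows(W, k):
    """Zero all but the k rows of W with the largest L2 norm.

    Hypothetical helper: one simple way to satisfy an L_(2,0) = k constraint.
    Returns the row-sparse matrix and the sorted indices of the kept rows,
    which correspond to the selected features.
    """
    row_norms = np.linalg.norm(W, axis=1)
    keep = np.sort(np.argsort(row_norms)[-k:])
    W_sparse = np.zeros_like(W)
    W_sparse[keep] = W[keep]
    return W_sparse, keep


# Toy data: 50 samples, 6 features; feature 1 is a scaled copy of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
X[:, 1] = 2.0 * X[:, 0]

R = cosine_redundancy(X)       # R[0, 1] is 1 up to rounding (fully redundant pair)
W = rng.normal(size=(6, 3))    # stand-in for a learned projection matrix
W_sparse, selected = keep_top_k_rows(W, k=2)
```

In this sketch, `selected` holds the indices of the k features that survive the row-sparsity constraint, and `R` could be used to penalize choosing two features whose similarity is close to 1.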

Authors: MENG Yingying; LI Qiaoyan; YANG Xiaofei; YUAN Lin (School of Science, Xi'an Polytechnic University, Xi'an 710600, China)

Affiliation: School of Science, Xi'an Polytechnic University

Source: Journal of Zhengzhou University (Natural Science Edition), 2023, No. 5, pp. 81-88 (8 pages). Indexed in CAS and the Peking University Core Journals list.

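Funding: National Natural Science Foundation of China (61976130); Key Research and Development Program of Shaanxi Province (2018KW-021); Natural Science Foundation of Shaanxi Province (2022KRM170).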

Keywords: feature selection; sparse learning; feature redundancy; matrix factorization; unsupervised learning

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]


