Multi-Label Feature Selection Based on Sparse Coefficient Matrix Reconstruction (cited by: 3)

Abstract

Handling complex multi-label data is a challenging task for feature selection in practical applications. However, existing multi-label feature selection methods leave three issues unsolved. First, previous methods either employ instance-level manifold regularization terms to preserve instance similarity or exploit label correlations to guide feature selection, although the two kinds of guidance are complementary. Second, existing methods explore label correlations through the affinity matrix built from instance similarity, ignoring the pairwise correlations between the labels themselves. Third, previous methods involve several unknown variables, which makes the objective function difficult to solve.

To address these issues, an empirical loss function is constructed from the least squares regression model. A label regularization term is then introduced to exploit label correlations, while the product of the feature matrix and the reconstructed sparse coefficient matrix represents the predicted labels so that the local geometric structure of the data is preserved. These terms are integrated into one joint learning framework, and an optimization scheme with provable convergence is designed to solve it.

The main contributions are as follows. The proposed method uses an instance-level manifold regularization term to preserve instance similarity and, at the same time, a label-level manifold regularization term to exploit label correlations. It stores the geometric structure of the labels in the weight coefficient matrix and uses that matrix to guide feature selection. Because the sparse coefficient matrix maintains the geometric relationship between the data space and the label space, as well as the relationships between labels, the matrix learned during training can yield superior classification ability on the test data. The method also introduces the L_{2,1}-norm, which combines the advantages of the L_1-norm and the L_2-norm, to select important features in each iteration. Finally, all of these terms are integrated into one joint learning framework, and a method is developed to solve the resulting constrained problem: regulating the regression coefficient matrix based on instance similarity and label similarity for multi-label feature selection, named RMLFS. Because the objective function is convex and contains only one unknown variable, unlike existing methods whose multiple unknown variables lead to locally optimal solutions in most cases, this framework can obtain a globally optimal solution.

To verify the classification performance of the proposed method, experiments with multiple evaluation criteria are conducted on thirteen real-world multi-label benchmark data sets. Eight competitive methods (MIFS, MDMR, SCLS, LRFS, mRMR, RALM-FS, TRCFS and GMM) are compared with the proposed method. The experimental results show that RMLFS outperforms the compared methods in these experiments.
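The abstract names the ingredients of the framework but not its exact objective, so the following is only a minimal sketch under stated assumptions, not the authors' RMLFS algorithm. It assumes an objective of the form ||XW - Y||_F^2 + alpha * tr((XW)^T Lx (XW)) + beta * tr(W Ly W^T) + gamma * ||W||_{2,1}, where Lx and Ly are graph Laplacians over instances and labels. The heat-kernel k-NN graphs, the iteratively reweighted solver, the hyperparameter values, and the names `knn_laplacian`, `objective` and `rank_features` are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_laplacian(Z, k=5):
    """Unnormalized Laplacian of a symmetric k-NN heat-kernel graph over the rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    S = np.exp(-d2 / (np.mean(d2) + 1e-12))
    np.fill_diagonal(S, 0.0)
    idx = np.argsort(-S, axis=1)[:, :k]            # k most similar rows per row
    mask = np.zeros(S.shape, dtype=bool)
    mask[np.arange(S.shape[0])[:, None], idx] = True
    S = np.where(mask | mask.T, S, 0.0)            # symmetrize the neighbor graph
    return np.diag(S.sum(axis=1)) - S

def objective(X, Y, W, Lx, Ly, alpha, beta, gamma, eps):
    """Assumed objective, with an eps-smoothed L2,1 term."""
    XW = X @ W
    fit = np.sum((XW - Y) ** 2)                    # least squares loss
    inst = np.trace(XW.T @ Lx @ XW)                # instance-level manifold term
    lab = np.trace(W @ Ly @ W.T)                   # label-level manifold term
    l21 = np.sum(np.sqrt(np.sum(W ** 2, axis=1) + eps))
    return fit + alpha * inst + beta * lab + gamma * l21

def rank_features(X, Y, alpha=0.1, beta=0.1, gamma=1.0, n_iter=30, eps=1e-8):
    """Iteratively reweighted least squares for the assumed objective."""
    d = X.shape[1]
    Lx = knn_laplacian(X)
    Ly = knn_laplacian(Y.T, k=min(5, Y.shape[1] - 1))
    base = X.T @ X + alpha * X.T @ Lx @ X
    lam, U = np.linalg.eigh(Ly)                    # Ly = U diag(lam) U^T
    CU = (X.T @ Y) @ U
    W = np.linalg.solve(base + gamma * np.eye(d), X.T @ Y)   # ridge-style start
    history = []
    for _ in range(n_iter):
        r = np.sqrt(np.sum(W ** 2, axis=1) + eps)  # smoothed row norms
        A = base + gamma * np.diag(1.0 / (2.0 * r))
        # Stationarity gives A W + beta W Ly = X^T Y; in the eigenbasis of Ly
        # each column decouples into one linear system.
        WU = np.column_stack([
            np.linalg.solve(A + beta * l * np.eye(d), CU[:, j])
            for j, l in enumerate(lam)
        ])
        W = WU @ U.T
        history.append(objective(X, Y, W, Lx, Ly, alpha, beta, gamma, eps))
    scores = np.linalg.norm(W, axis=1)             # row norm = feature importance
    return np.argsort(-scores), W, history
```

Under these assumptions W is the only unknown and every term is convex, which mirrors the abstract's argument for a globally optimal solution. Features are ranked by the row norms of W, the quantity the L_{2,1} penalty drives toward zero for uninformative features.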

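Authors: LI Yong-Hao; HU Liang; GAO Wan-Fu (College of Computer Science and Technology, Jilin University, Changchun 130012; Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012)

Affiliations: College of Computer Science and Technology, Jilin University; Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University

Source: Chinese Journal of Computers (计算机学报), indexed in EI, CAS, CSCD and the Peking University Core list; 2022, No. 9, pp. 1827-1841 (15 pages)

Funding: Supported by the National Key R&D Program of China (2017YFA0604500), the Jilin Province Key Science and Technology R&D Project (20180201103GX), and the Joint Fund Project of the Science and Technology Department of Jilin Province (2020122209JC).

Keywords: feature selection; multi-label learning; manifold learning; sparse learning; classification

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]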

