Abstract
To address feature redundancy in high-dimensional unlabeled data, an unsupervised feature selection method based on feature regular sparse association (FRSA) is proposed. A feature selection model is built in which a loss term based on the Frobenius norm expresses the association between features, and an L_1 sparse regularization constraint is imposed on the feature weight matrix to promote sparsity. A divide-and-conquer shrinkage-thresholding iterative algorithm is designed to optimize the objective function. The importance of each feature is evaluated from its feature weight, and representative features are selected. In comparative experiments on six standard data sets of different types against commonly used unsupervised feature selection methods, the proposed method outperforms the other methods.
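The abstract does not state the objective function, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' FRSA algorithm. It assumes a self-representation objective, 0.5 * ||X - XW||_F^2 + lam * ||W||_1 with a zero diagonal on W, and solves it with plain iterative shrinkage-thresholding (ISTA); the divide-and-conquer step is not reproduced. The function names, the parameter `lam`, and the row-norm scoring rule are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(Z, tau):
    """Elementwise shrinkage operator: the proximal map of tau * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)


def ista_feature_scores(X, lam=0.1, n_iter=200):
    """Run ISTA on 0.5 * ||X - X W||_F^2 + lam * ||W||_1 (assumed objective).

    X is an (n_samples, n_features) array. Returns the weight matrix W and
    one score per feature (the L2 norm of the corresponding row of W).
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    G = X.T @ X                       # Gram matrix, reused in every iteration
    L = np.linalg.norm(G, 2)          # Lipschitz constant of the smooth term
    step = 1.0 / L if L > 0 else 1.0
    W = np.zeros((d, d))
    for _ in range(n_iter):
        grad = G @ W - G              # gradient of 0.5 * ||X - X W||_F^2
        W = soft_threshold(W - step * grad, step * lam)
        np.fill_diagonal(W, 0.0)      # assumption: no feature represents itself
    scores = np.linalg.norm(W, axis=1)
    return W, scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:k]
```

Under these assumptions, a feature whose row of W has a large norm contributes strongly to reconstructing the other features, so `select_top_k(scores, k)` keeps the k features with the largest row norms. Because the assumed objective separates over the columns of W, each column could also be solved independently, which is one possible reading of the divide-and-conquer step.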
Authors
白圣子
降爱莲
BAI Sheng-zi; JIANG Ai-lian (College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China)
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2022, Issue 4, pp. 969-976 (8 pages)
Computer Engineering and Design
Funding
Shanxi Province Research Fund Project for Returned Overseas Scholars (2017-051).
Keywords
feature selection
feature association
sparse representation
shrinkage thresholding
regularization