Abstract
To address the "curse of dimensionality" that high-dimensional data causes in classification, this paper proposes a new feature selection algorithm that combines kernel functions with sparse learning. Specifically, each feature dimension is first mapped into a kernel space by a kernel function, and linear feature selection is performed in this high-dimensional kernel space, which amounts to nonlinear feature selection in the original low-dimensional space. Second, the kernel-mapped features are sparsely reconstructed to obtain a sparse representation of the original dataset. Next, the L1-norm is used to build a feature-scoring mechanism that selects the optimal feature subset. Finally, the data after feature selection are used in classification experiments. Experimental results on public datasets show that the proposed algorithm performs feature selection well and improves classification accuracy by about 3% compared with the comparison algorithms.
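The pipeline described in the abstract (per-feature kernel mapping, sparse reconstruction, L1-based scoring) can be illustrated with a minimal sketch. The abstract does not give the paper's objective function, so the code below is not the authors' method: it swaps in an HSIC-Lasso-style formulation, in which each feature's centered RBF Gram matrix becomes one column of a design matrix, the label Gram matrix is the regression target, and an L1-penalized least-squares fit (solved with plain ISTA) yields one weight per feature. The function names `rbf_gram` and `kernel_sparse_scores` and the parameters `gamma`, `lam`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_gram(v, gamma=1.0):
    """Centered RBF Gram matrix of a single feature column v (shape: n)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    K = np.exp(-gamma * d2)
    n = len(v)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H


def kernel_sparse_scores(X, y, lam=0.05, gamma=1.0, n_iter=500):
    """Score features by L1-penalized regression on per-feature kernels.

    Illustrative stand-in (HSIC-Lasso style), not the paper's objective.
    Returns one non-negative score per feature; larger means more relevant.
    """
    n, d = X.shape
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

    # Each column of A is the flattened, unit-norm Gram matrix of one feature.
    A = np.stack([rbf_gram(X[:, j], gamma).ravel() for j in range(d)], axis=1)
    A /= np.linalg.norm(A, axis=0) + 1e-12

    # Target: centered label Gram matrix (1 where two samples share a class).
    L = (y[:, None] == y[None, :]).astype(float)
    H = np.eye(n) - np.ones((n, n)) / n
    b = (H @ L @ H).ravel()
    b /= np.linalg.norm(b) + 1e-12

    # ISTA for: min_w 0.5 * ||A w - b||^2 + lam * ||w||_1
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        z = w - step * (A.T @ (A @ w - b))
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return np.abs(w)
```

Ranking features by the returned scores and keeping the top few gives a feature subset that can be passed to any classifier. Because the kernels are computed per feature, a feature that relates to the label only nonlinearly can still receive a large weight, which is the effect the abstract attributes to performing linear selection in the kernel space.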
Authors
Lü Zhi-zheng (吕治政)
Li Yang-ding (李扬定)
Lei Cong (雷聪)
College of Computer Science and Information Engineering, Guangxi Normal University, Guilin 541004, China
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
Peking University Core Journal
2020, No. 1, pp. 167-177 (11 pages)
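Funding
National Key R&D Program of China (2016YFB1000905)
National Natural Science Foundation of China (6117013120)
National 973 Program of China (2013CB329404)
China Postdoctoral Science Foundation (2015M570837)
Guangxi Natural Science Foundation (2015GXNSFCB139011)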
Keywords
feature selection
nonlinear
kernel function
sparse learning
L1-norm