Abstract
This paper studies feature extraction for high-dimensional, small-sample-size problems in which the data distribution is unknown. Building on the support vector machine principle and the idea of statistically uncorrelated features, it proposes a recursive statistically uncorrelated feature extraction algorithm based on the scatter support vector machine (SSVM). The algorithm addresses drawbacks of existing methods, such as correlation among extracted features and sensitivity to the sample distribution. To cope with the small-sample-size problem, PCA is used to transform the SSVM optimization problem into an isomorphic lower-dimensional space. A recursive method for computing the set of margin discriminant vectors is then given, and statistically uncorrelated features are extracted by projecting the high-dimensional pattern features onto that set. The convergence and termination condition of the algorithm are analyzed. Kernel methods extend the linear SSVM to a nonlinear SSVM: KPCA transforms the nonlinear SSVM optimization problem into an equivalent optimization problem in a lower-dimensional space, where uncorrelated nonlinear features are extracted. Simulation results demonstrate the effectiveness of the proposed algorithm.
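The recursive structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's SSVM: the helper `stand_in_direction` is a hypothetical placeholder (a plain class-mean difference) for the SSVM optimization, and the function names, tolerances, and toy data are all assumptions made for illustration. What the sketch does show is the two surrounding steps: reducing the data with PCA to the subspace spanned by the samples, and constraining each new direction w_k so that w_k^T S_t w_j = 0 for all earlier w_j, which makes the projected features statistically uncorrelated.

```python
import numpy as np


def pca_reduce(X, tol=1e-10):
    """Project centered samples onto the principal axes with non-zero variance."""
    Xc = X - X.mean(axis=0)
    # Thin SVD: rows of Vt are principal axes; at most n - 1 carry variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s.max()
    return Xc @ Vt[keep].T


def stand_in_direction(Z, y):
    """Hypothetical placeholder for the SSVM solver: class-mean difference."""
    return Z[y == 1].mean(axis=0) - Z[y == 0].mean(axis=0)


def uncorrelated_directions(Z, y, n_dirs, tol=1e-8):
    """Recursively pick directions w_k with w_k^T S_t w_j = 0 for all j < k."""
    St = Z.T @ Z / Z.shape[0]          # total scatter in the reduced space
    W, C = [], []                      # directions and constraints c_j = S_t w_j
    for _ in range(n_dirs):
        d = stand_in_direction(Z, y)
        if C:
            # Remove the component of d lying in span{c_j}.
            Q, _ = np.linalg.qr(np.column_stack(C))
            d = d - Q @ (Q.T @ d)
        norm = np.linalg.norm(d)
        if norm < tol:                 # nothing left to extract: stop
            break
        w = d / norm
        W.append(w)
        C.append(St @ w)
    return np.column_stack(W)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))         # toy data: 20 samples, 200 dimensions
y = np.repeat([0, 1], 10)
X[y == 1, :5] += 2.0                   # shift class 1 along a few coordinates
Z = pca_reduce(X)                      # reduced dimension is at most n - 1
W = uncorrelated_directions(Z, y, n_dirs=3)
features = Z @ W                       # one column per extracted feature
cov = np.cov(features, rowvar=False)   # off-diagonal entries should be ~0
```

In this sketch the early-exit test on the residual norm plays the role of a termination condition; the condition analyzed in the paper may differ.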
Source
《计算机学报》
EI
CSCD
PKU Core Journal (北大核心)
2011, No. 3, pp. 443-451 (9 pages)
Chinese Journal of Computers
Funding
Xuzhou Normal University Cultivation Project (08XLY10)
China Postdoctoral Science Foundation (20060390277)
Jiangsu Province "Six Talent Peaks" Program (06-E-05)
Keywords
scatter support vector machine (SSVM)
classification
feature extraction
uncorrelated margin discriminant vectors
principal component analysis (PCA)