Abstract
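To address feature extraction for high-dimensional, small-sample data, a discriminant principal component analysis method is proposed that combines principal component analysis (PCA) with linear discriminant analysis (LDA). A linear discriminant is applied to each PCA principal component individually, and the components that mainly reflect between-class differences are selected to construct the feature space. Experiments on the yeast and NCI gene expression data show that the method obtains good discriminant features while reducing dimensionality and avoids the singularity problem of linear discriminant analysis. In the resulting subspace, the clustering recognition rate is more than 20% higher than with PCA and the visualization is good, which demonstrates the effectiveness of the method for feature extraction from high-dimensional, small-sample data.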
A method called Discriminant Principal Component Analysis, which fuses PCA and LDA, is proposed for feature extraction from high-dimensional, small-sample data. The method applies linear discriminant analysis to each individual principal component among the features derived from PCA. In this way, a subspace with maximum inter-class variation and minimum intra-class variation is established. The proposed method is used to reduce the dimensionality and extract features of the yeast and NCI gene expression data, and clustering is performed on the reduced data. Experimental results indicate that the method obtains better discriminant features at reduced dimensionality, avoids the singularity of linear discriminant analysis, and outperforms PCA in visualization, with recognition accuracy increased by more than 20%.
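The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact discriminant criterion, so the sketch assumes a per-component Fisher ratio (between-class scatter divided by within-class scatter) as the "single linear discriminant"; the function name `discriminant_pca` and the parameter `n_keep` are hypothetical and do not come from the paper. Because each component is scored as a scalar, no within-class scatter matrix has to be inverted, which is one plausible reading of how the singularity of LDA is avoided.

```python
import numpy as np

def discriminant_pca(X, y, n_keep=2):
    """Rank PCA components by a per-component Fisher ratio (assumed criterion).

    X: (n_samples, n_features) data matrix, y: (n_samples,) class labels.
    Returns the projected data, the selected axes, and their Fisher ratios.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xc = X - X.mean(axis=0)

    # PCA via SVD, which stays cheap when n_features >> n_samples.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vt = Vt[S > 1e-10 * S[0]]          # drop numerically empty directions
    scores = Xc @ Vt.T                 # (n_samples, n_components)

    # Score every principal component on its own: scalar between-class
    # scatter over scalar within-class scatter, so nothing is inverted.
    overall_mean = scores.mean(axis=0)
    between = np.zeros(scores.shape[1])
    within = np.zeros(scores.shape[1])
    for label in np.unique(y):
        Z = scores[y == label]
        mu = Z.mean(axis=0)
        between += len(Z) * (mu - overall_mean) ** 2
        within += ((Z - mu) ** 2).sum(axis=0)
    fisher = between / (within + 1e-12)

    # Keep the components with the largest ratios, not the largest variance.
    order = np.argsort(fisher)[::-1][:n_keep]
    return scores[:, order], Vt[order], fisher[order]
```

Under this assumed criterion, a component with large variance but no class structure is ranked below a lower-variance component that separates the classes, which is where the result differs from keeping the leading PCA components.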
Source
《燕山大学学报》
CAS
2010, No. 5, pp. 426-430 (5 pages)
Journal of Yanshan University
Funding
National Natural Science Foundation of China (Grant No. 20875073)
Zhengzhou Major Research Project (Grant No. 072SGZS38042)