Abstract
Feature selection is a key technique for processing high-dimensional image data without losing much intrinsic information, and an important preprocessing step for such data. Drawing on the ideas of Fisher Discriminant Analysis (FDA) and Canonical Correlation Analysis (CCA), this paper addresses supervised feature selection, in which prior information is given as class labels for the samples, and proposes a supervised method called Locality Preserving Multi-projection Vector Fisher Discriminant Analysis (LPMVF). LPMVF takes the local structure of the original data into account, so multimodal sample data can be embedded appropriately. By defining a new criterion, LPMVF offers the following advantages: (1) it is easy to compute and effectively avoids the singularity problem; (2) it extends readily to nonlinear feature spaces through the standard kernel mapping; (3) like CCA, it yields a pair of projection transformations, one set of basis vectors for each of two multivariate datasets of different classes, which project the original data onto a set of more useful features and make the projected data more separable in the embedding space, benefiting classification and pattern recognition; (4) like Local Fisher Discriminant Analysis (LFDA), it preserves the local neighborhood relationships between data points; (5) in most cases, its learning performance is superior to that of the classical FDA, KFD, and LFDA algorithms. The feasibility and effectiveness of LPMVF are verified through extensive visualization and classification tasks. Experimental results on several benchmark datasets show that LPMVF and its nonlinear extension extract features with stronger descriptive power and effectively use class-label supervision to improve classification performance.
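The abstract does not state the LPMVF criterion itself, so the following is only a minimal sketch of the classical two-class FDA baseline that LPMVF is compared against, not the authors' method. The function name `fisher_direction`, the ridge parameter `reg` (added here as one common way to sidestep a singular within-class scatter matrix), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fisher_direction(X, y, reg=1e-6):
    """Two-class Fisher direction, w proportional to (S_w + reg*I)^{-1} (m1 - m0).

    Hypothetical helper for illustration; labels in y are assumed to be 0/1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Small ridge term (an assumption, not from the paper) keeps Sw invertible.
    Sw += reg * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

# Synthetic two-class data, purely illustrative.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, size=(50, 2)),
               rng.normal([4, 4], 1.0, size=(50, 2))])
y = np.repeat([0, 1], 50)

w = fisher_direction(X, y)
z = X @ w  # one-dimensional projection of every sample
```

Projecting onto `w` places the class-1 mean above the class-0 mean along the discriminant axis. A locality-preserving variant such as LFDA would additionally weight sample pairs by an affinity matrix when building the scatter matrices; that weighting is omitted here.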
Source
Chinese Journal of Computers (《计算机学报》)
Indexed in: EI, CSCD, Peking University Core Journals
2010, No. 5, pp. 865-876 (12 pages)
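Funding
Natural Science Foundation of Jiangsu Province (BK2009393)
National Natural Science Foundation of China (30671639)
Jiangsu Province University Science and Technology Innovation Program (164070265)
Nanjing Forestry University Science and Technology Innovation Project (2009106)
2009 Jiangsu Province Graduate Innovation Fund (CX09S_013Z)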
Keywords
locality preservation
multi-projection vector
feature selection
classification
discriminant analysis