Abstract
Canonical correlation analysis (CCA) is a multivariate statistical method that seeks the maximum correlation between two sets of variables describing the same objects. Its criterion function, which is based on the L2-norm minimum mean square error (MSE), is not robust to outliers. The generalized mean has been proved robust in theory and has been validated in applications such as clustering and object recognition. This paper applies the generalized mean to CCA and proposes a robust CCA based on the generalized mean (GMCCA), which overcomes the sensitivity of CCA to outliers. On the one hand, GMCCA achieves robustness by suppressing the influence of outliers on the criterion function. On the other hand, it avoids the singular covariance matrix problem caused by high-dimensional, small-sample data. Experiments on the multiple feature database (MFD), the ORL face database, and the COIL-20 object image database demonstrate the effectiveness of GMCCA.
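The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the GMCCA algorithm of the paper, whose criterion function is not given in this record. The function names `generalized_mean` and `cca_first_pair`, the ridge term `reg`, and the toy error values are assumptions made for illustration only. The sketch shows (a) classical CCA computed by whitening the covariances and taking an SVD, and (b) how a generalized mean with a negative exponent is dominated by small values, so a single large error contributes far less than it does to the arithmetic mean.

```python
import numpy as np


def generalized_mean(values, p):
    """Generalized (power) mean of positive values: (mean(v**p))**(1/p), p != 0."""
    v = np.asarray(values, dtype=float)
    return float(np.mean(v ** p) ** (1.0 / p))


def cca_first_pair(X, Y, reg=1e-6):
    """Classical CCA: first pair of canonical directions and their correlation.

    X has shape (n, dx) and Y has shape (n, dy), one row per object.
    `reg` is a small ridge term (an assumption of this sketch, not the
    paper's remedy) that keeps the covariance matrices invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance T = Lx^{-1} Sxy Ly^{-T}.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T)
    wx = np.linalg.solve(Lx.T, U[:, 0])  # direction for the X view
    wy = np.linalg.solve(Ly.T, Vt[0])    # direction for the Y view
    return wx, wy, float(s[0])


# Toy squared errors with one outlier (hypothetical values).
errors = [1.0, 1.0, 1.0, 1.0, 100.0]
arithmetic = generalized_mean(errors, 1)   # pulled up strongly by the outlier
harmonic = generalized_mean(errors, -1)    # stays close to the typical error
```

With `p = 1` the generalized mean is the ordinary average that underlies an MSE criterion. With `p = -1` it is the harmonic mean, which stays near the typical error even when one sample is far off.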
Authors
顾高升
葛洪伟
周梦璇
GU Gaosheng; GE Hongwei; ZHOU Mengxuan (School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China; Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, Jiangsu 214122, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journals (北大核心)
2017, No. 7, pp. 1140-1149 (10 pages)
Journal of Frontiers of Computer Science and Technology
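Funding
Graduate Research and Innovation Program for Regular Universities of Jiangsu Province, No. KYLX15_1169
Priority Academic Program Development of Jiangsu Higher Education Institutions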
Keywords
generalized mean
mean square error
canonical correlation analysis
robustness
robust canonical correlation analysis