Abstract
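Principal component analysis is a widely used multivariate statistical technique. When it is applied to high-dimensional data, the statistical consistency and physical interpretability of its results are hard to guarantee. Introducing sparsity constraints aimed at variable selection can effectively alleviate these difficulties. Drawing on research progress over the last 10 years, this paper explains the basic concept of sparsity and the design criteria for penalty functions, and introduces the classical sparsity constraint, lasso, together with several of its variants: fused lasso, group lasso, adaptive lasso, elastic net, and others. Lasso and its variants can all serve as constraints on principal component analysis to build a sparse principal component analysis framework, but the key lies in how to transform the sparse principal component problem into a convex optimization problem and solve it quickly. The paper compares several reformulations of sparse principal components: singular value decomposition, sparse regression, low-rank approximation, penalized matrix decomposition, and semidefinite relaxation. It analyzes solution methods for the ordinary lasso and generalized lasso problems based on the least angle regression algorithm. It also gives a preliminary discussion of sparse principal component analysis for functional data.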
Principal component analysis (PCA) is a popular multivariate statistical technique. However, when the samples are high-dimensional, the principal component estimates are often inconsistent and the principal components are difficult to interpret. These two difficulties can be partially overcome by variable selection with sparsity constraints. This survey describes the basic concept of sparsity and the design criteria for penalties. A typical sparsity constraint, lasso, is introduced along with its related variants: fused lasso, group lasso, adaptive lasso and elastic net. Any of these constraints can be added to PCA to build a sparse PCA framework; the emphasis is on how to transform sparse PCA into a convex optimization problem and solve it quickly. Several reformulations of sparse PCA are compared: singular value decomposition, sparse regression, low-rank matrix approximation, penalized matrix decomposition and semidefinite relaxation. Approaches to solving the ordinary and generalized lasso problems based on least angle regression (LAR) are analyzed. Sparse PCA for functional data is discussed as a prospect.
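As an illustration of how a lasso-type penalty can induce sparse loadings, the following is a minimal sketch of a rank-one approximation with soft-thresholding, in the spirit of the low-rank approximation and penalized matrix decomposition formulations listed above. It is not taken from the surveyed paper: the function names `soft_threshold` and `sparse_pc1`, the penalty level `lam`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft-thresholding, the proximal operator of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_pc1(X, lam, n_iter=200, tol=1e-8):
    """Hypothetical helper: first sparse loading vector of X.

    Alternates between updating the unit-norm score direction u and the
    soft-thresholded, unit-norm loading vector v. With lam = 0 this
    reduces to power iteration for the leading principal component.
    """
    X = X - X.mean(axis=0)                      # center the columns
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[0]                                   # start from the ordinary PCA loading
    for _ in range(n_iter):
        u = X @ v
        u_norm = np.linalg.norm(u)
        if u_norm == 0.0:
            break
        u = u / u_norm
        v_new = soft_threshold(X.T @ u, lam)    # l1 penalty zeroes small loadings
        v_norm = np.linalg.norm(v_new)
        if v_norm == 0.0:                       # penalty too large: everything removed
            return v_new
        v_new = v_new / v_norm
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return v

# Synthetic data (assumed for illustration): 10 variables, only the first
# two share a strong common factor; the rest are low-variance noise.
rng = np.random.default_rng(0)
n, p = 200, 10
z = rng.normal(size=n)
X = 0.1 * rng.normal(size=(n, p))
X[:, 0] += z
X[:, 1] += z

v = sparse_pc1(X, lam=2.0)
support = np.flatnonzero(v)   # indices of variables kept in the sparse component
```

In this sketch the penalty level `lam` trades variance explained against the number of nonzero loadings; in practice it would be chosen by cross-validation or a similar criterion rather than fixed by hand.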
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2012, No. 12, pp. 2525-2532 (8 pages)
Acta Electronica Sinica
Funding
National Natural Science Foundation of China (No. 61101022)
National Key Technology R&D Program of China (No. 2009BAF40B03)
Keywords
sparsity
principal component analysis
lasso (least absolute shrinkage and selection operator)
convex optimization