Abstract
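Linear discriminant analysis (LDA) is a method for dimensionality reduction and classification. In small-sample-size problems, however, the total scatter matrix is singular, so classical LDA cannot be applied. To address this shortcoming, regularized algorithms based on least squares linear discriminant analysis (LS-LDA) are proposed: L1-norm, L2-norm, and elastic net penalty terms on the weight matrix are each added to LS-LDA to handle the small-sample-size problem and to make the model robust and sparse. Building on an in-depth analysis of regression analysis, regularization methods, and LS-LDA, a regularized least squares LDA framework is constructed to perform dimensionality reduction. Experiments are conducted on standard text datasets, with a KNN (K-Nearest-Neighbor) classifier used for text classification. The results show that regularized LS-LDA achieves good classification performance, and that LS-LDA with the elastic net penalty performs best.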
Linear Discriminant Analysis (LDA) is a well-known technique for dimensionality reduction and classification. However, the classical LDA formulation fails when the total scatter matrix is singular, which usually occurs in undersampled problems. In this paper, regularized Least Squares LDA (RLS-LDA) based on the L2-norm, the L1-norm, and the elastic net is proposed to handle such problems; the resulting models are robust and sparse. First, the theory of linear regression and regularization is reviewed, and the equivalence between the least squares formulation and LDA for multi-class classification under a mild condition is summarized. Second, the construction of RLS-LDA is presented. The approaches are evaluated on a benchmark collection of text documents. The results demonstrate the effectiveness of the proposed RLS-LDA, with the elastic net variant outperforming the others.
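The idea summarized above (regress a class indicator matrix on the data with an L1, L2, or elastic net penalty on the weight matrix, then classify the projected data with KNN) can be illustrated with a minimal sketch. This is not the paper's implementation. The centered one-hot indicator matrix, the proximal gradient (ISTA) solver, the synthetic undersampled data, the hyperparameter values, and all function names (`fit_rls_lda`, `knn_predict`, `make_toy_data`) are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(A, t):
    """Elementwise shrinkage: the proximal operator of t * ||A||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def fit_rls_lda(X, y, alpha=0.2, l1_ratio=0.5, n_iter=500):
    """Illustrative regularized least squares LDA (assumed formulation).

    Minimizes ||Xc W - Y||_F^2 / (2n)
              + alpha * (l1_ratio * ||W||_1 + (1 - l1_ratio) / 2 * ||W||_F^2)
    with proximal gradient descent. l1_ratio=0 gives the L2 penalty,
    l1_ratio=1 the L1 penalty, and values in between the elastic net.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center the features
    # Assumption: centered one-hot class indicators as regression targets.
    Y = (y[:, None] == classes[None, :]).astype(float)
    Y -= Y.mean(axis=0)
    n = X.shape[0]
    l1, l2 = alpha * l1_ratio, alpha * (1.0 - l1_ratio)
    # Lipschitz constant of the gradient of the smooth part of the objective.
    lipschitz = max(np.linalg.norm(Xc, 2) ** 2 / n + l2, 1e-12)
    step = 1.0 / lipschitz
    W = np.zeros((X.shape[1], classes.size))
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ W - Y) / n + l2 * W    # least squares + L2 term
        W = soft_threshold(W - step * grad, step * l1)  # L1 term
    return W, mean

def knn_predict(Z_train, y_train, Z_test, k=3):
    """Plain k-nearest-neighbor majority vote with Euclidean distance."""
    d = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in y_train[nearest]])

def make_toy_data(rng, n_per_class, d=200, n_classes=3, shift=3.0):
    """Synthetic data: each class shifts its own block of 5 features."""
    y = np.repeat(np.arange(n_classes), n_per_class)
    X = rng.normal(size=(y.size, d))
    for c in range(n_classes):
        X[y == c, 5 * c:5 * c + 5] += shift
    return X, y

# Undersampled setting: 45 training samples in 200 dimensions.
rng = np.random.default_rng(0)
X_train, y_train = make_toy_data(rng, 15)
X_test, y_test = make_toy_data(rng, 20)
W, mean = fit_rls_lda(X_train, y_train, alpha=0.2, l1_ratio=0.5)
pred = knn_predict((X_train - mean) @ W, y_train, (X_test - mean) @ W)
print("fraction of zero weights:", np.mean(W == 0))
print("test accuracy:", np.mean(pred == y_test))
```

Because the penalized regression never inverts a scatter matrix, the sketch remains well defined when there are fewer samples than features. Setting `l1_ratio` to 0, 1, or an intermediate value switches between the three penalties named in the abstract.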
Source
《江西电力职业技术学院学报》
CAS
2010, No. 1, pp. 35-39 (5 pages)
Journal of Jiangxi Vocational and Technical College of Electricity
Funding
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ10446)