Abstract
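A method based on the probabilistic principal component analysis (PPCA) model is proposed for determining the subspace dimension of protuberant character images. First, a PPCA model of the observed data is built. Next, simulations on synthetic data are used to analyze the factors that affect dimension selection and to give the range of applicability of each of the three criteria. Finally, a rough interval for the intrinsic dimension is read from the eigenvalue curve of the covariance matrix of the protuberant character data set, and the optimal dimension is then determined with each of the AIC, BIC, and CAIC model selection criteria. Experiments show that the method improves the robustness of the algorithm and effectively reduces its running time.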
A central issue in principal component analysis (PCA) is choosing the number of principal components to retain. Although a number of model selection criteria exist, most studies either assume a known dimension or determine it heuristically. In this paper, the probabilistic reformulation of PCA is used to derive model selection criteria for the intrinsic dimensionality of the data: Akaike's information criterion (AIC), the consistent Akaike's information criterion (CAIC), and the Bayesian information criterion (BIC). The parameters that can affect model selection are analyzed in detail. To estimate the intrinsic dimension of protuberant character images, a rough range for the intrinsic dimension is obtained in the first step, and the optimal dimension is estimated within that range in the second step. Experimental results show that the algorithm is robust and that it effectively decreases the running time.
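The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard maximized PPCA log-likelihood written in terms of the sample covariance eigenvalues, and the textbook definitions AIC = -2L + 2k, BIC = -2L + k ln N, and CAIC = -2L + k(ln N + 1). The function names and the free-parameter count `n_free_params` (mean, noise variance, and loadings up to rotation) are assumptions made for illustration; the paper may count parameters differently.

```python
import numpy as np


def ppca_max_loglik(eigvals, n_samples, q):
    """Maximized PPCA log-likelihood when q components are retained.

    eigvals: eigenvalues of the ML sample covariance, sorted descending.
    """
    d = eigvals.size
    # ML noise variance is the mean of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    return -0.5 * n_samples * (
        d * np.log(2.0 * np.pi)
        + np.log(eigvals[:q]).sum()
        + (d - q) * np.log(sigma2)
        + d
    )


def n_free_params(d, q):
    # Assumed count: mean (d) + noise variance (1) + loadings up to rotation.
    return d + 1 + d * q - q * (q - 1) // 2


def select_dimension(X, q_candidates):
    """Return the q minimizing AIC, BIC, and CAIC over q_candidates.

    X: (n_samples, n_features) data matrix.
    q_candidates: candidate dimensions, e.g. a rough range read off the
        eigenvalue curve; each must satisfy 1 <= q < n_features.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n                       # ML covariance estimate
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # sort descending
    eigvals = np.clip(eigvals, 1e-12, None)   # guard against log(0)

    scores = {"AIC": {}, "BIC": {}, "CAIC": {}}
    for q in q_candidates:
        if not 1 <= q < d:
            raise ValueError("each candidate q must satisfy 1 <= q < n_features")
        loglik = ppca_max_loglik(eigvals, n, q)
        k = n_free_params(d, q)
        scores["AIC"][q] = -2.0 * loglik + 2.0 * k
        scores["BIC"][q] = -2.0 * loglik + k * np.log(n)
        scores["CAIC"][q] = -2.0 * loglik + k * (np.log(n) + 1.0)
    return {name: min(s, key=s.get) for name, s in scores.items()}


if __name__ == "__main__":
    # Synthetic data: 3 latent dimensions embedded in 10, plus isotropic noise.
    rng = np.random.default_rng(0)
    basis, _ = np.linalg.qr(rng.standard_normal((10, 3)))
    W = basis * np.array([3.0, 2.0, 1.0])
    X = rng.standard_normal((2000, 3)) @ W.T + 0.1 * rng.standard_normal((2000, 10))
    print(select_dimension(X, range(1, 10)))
```

Restricting `q_candidates` to the rough interval from the eigenvalue curve mirrors the first step of the method and keeps the number of candidate models small. Because AIC applies a lighter penalty than BIC and CAIC, it can select a larger dimension on the same data.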
Source
《光电子.激光》
EI
CAS
CSCD
Peking University Core Journals
2010, Issue 5, pp. 754-757 (4 pages)
Journal of Optoelectronics·Laser
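Funding
Doctoral Program Foundation of the Ministry of Education (20060422011)
Natural Science Foundation of Shandong Province (Q2008G02)
Northwest A&F University Special Talent Fund (Z111020905)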
Keywords
principal component analysis (PCA)
probabilistic principal component analysis (PPCA)
intrinsic dimension
dimension estimation
protuberant character image