Abstract
Feature extraction and selection are the key to improving the recognition rate of off-line handwritten digits. Principal curves are a nonlinear generalization of principal component analysis: smooth, self-consistent curves that pass through the “middle” of a data distribution and therefore reflect its structure well. The proposed feature selection proceeds in three steps. First, principal curves are used to extract structural features from the training data. Second, a detailed analysis of the structural characteristics of the digits' principal curves yields the features used for coarse classification and for fine classification. Finally, at recognition time, a handwritten digit is classified coarsely first and finely second. Experiments on the CENPARMI handwritten digit database of Concordia University show that these features discriminate effectively between similar characters and improve the recognition rate, providing a new approach to research on off-line handwritten digit recognition.
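For readers unfamiliar with the term, the "self-consistent" property mentioned in the abstract can be written out as follows. This is the usual textbook formulation of principal curves, not notation taken from this record: the symbols $f$, $\lambda$, $\lambda_f$, and $X$ are assumptions introduced here for illustration.

```latex
% Assumed notation: X is a random vector, f a smooth curve parameterized by \lambda.
% Projection index: the parameter of the point on f closest to x
% (the largest such parameter if several points are equally close).
\lambda_f(x) = \sup\Bigl\{ \lambda : \lVert x - f(\lambda) \rVert
               = \inf_{\mu} \lVert x - f(\mu) \rVert \Bigr\}

% Self-consistency: each point of the curve is the mean of the data
% that project onto it.
f(\lambda) = \mathbb{E}\bigl[\, X \mid \lambda_f(X) = \lambda \,\bigr]
```

Read this way, the curve passes through the "middle" of the data in the sense that every point on it is the average of the observations closest to that point.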
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
Peking University Core Journal
2005, No. 8, pp. 1344-1349 (6 pages)
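Funding
National Natural Science Foundation of China (60175016, 60475019)
National Key Basic Research Development Program of China ("973" Program) (2003CB316902)
Major Science and Technology Research Project of the Shanghai Science and Technology Commission (03DZ15029)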
Keywords
principal curves
structural features
feature selection