Abstract
Identification of script and character type has become an active research area, in China and abroad, following machine-printed text recognition. Little research addresses distinguishing handwritten from printed text at the level of single characters, although this task is very common in form recognition. For the character type identification problem, the manifold learning algorithm Locally Linear Embedding (LLE) is introduced, under the assumption that the data lie on a low-dimensional manifold embedded in a high-dimensional space. A generalization method of LLE for single-character type identification is proposed, together with parameter estimation methods for the neighborhood size and the intrinsic dimension. Experiments on distinguishing printed from handwritten Chinese characters and digits show that the method outperforms direct Support Vector Machine (SVM) classification. Moreover, classifying the LLE-reduced data directly with Linear Discriminant Analysis (LDA) achieves accuracy similar to, or even higher than, LLE followed by SVM classification, and classifies faster.
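For orientation, the standard LLE formulation can be written as below. This is a sketch in the usual textbook notation, not necessarily the notation or the exact variant used in the paper; the symbols $x_i$, $y_i$, $W_{ij}$, $k$, $d$ and $N$ are assumptions introduced here. The neighborhood size $k$ and the intrinsic dimension $d$ correspond to the two parameters that the abstract says are estimated. The paper's generalization method for new samples is not described in the abstract and is not shown.

```latex
% Step 1: for each sample x_i, find its k nearest neighbors \mathcal{N}(i).

% Step 2: reconstruction weights (each sample as a combination of its neighbors)
\min_{W} \; \varepsilon(W) = \sum_{i=1}^{N} \Bigl\| x_i - \sum_{j \in \mathcal{N}(i)} W_{ij}\, x_j \Bigr\|^2
\quad \text{s.t.} \quad \sum_{j \in \mathcal{N}(i)} W_{ij} = 1, \qquad W_{ij} = 0 \ \text{if } j \notin \mathcal{N}(i)

% Step 3: d-dimensional embedding that preserves the weights W from Step 2
\min_{Y} \; \Phi(Y) = \sum_{i=1}^{N} \Bigl\| y_i - \sum_{j=1}^{N} W_{ij}\, y_j \Bigr\|^2
\quad \text{s.t.} \quad \sum_{i=1}^{N} y_i = 0, \qquad \frac{1}{N} \sum_{i=1}^{N} y_i y_i^{\top} = I_d

% Solution: the y_i are given by the eigenvectors of
M = (I - W)^{\top} (I - W)
% associated with the d smallest nonzero eigenvalues
% (the constant eigenvector with eigenvalue 0 is discarded).
```

In this formulation, a classifier such as LDA or SVM would then be trained on the low-dimensional coordinates $y_i$ rather than on the original features $x_i$.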
Source
《计算机工程与应用》
CSCD
Peking University Core Journal (北大核心)
2008, Issue 6, pp. 206-209 (4 pages)
Computer Engineering and Applications
Funding
Supported by the Hubei Province Key New Product Program (No. 2003BDST004).
Keywords
character type identification
manifold learning
Locally Linear Embedding (LLE)
parameter estimation