Abstract
Optical character recognition (OCR) plays an important role in converting between paper-based information and photoelectric information. Using a public optical character recognition dataset of 20,000 samples, this paper studies OCR techniques and examines the features needed to recognize characters. Images are denoised with a low-pass filter based on a reciprocal-Gaussian cascade, then pass through binarization, coarse and fine feature recognition, a vertical projection angle transformation, and normalization of the character information. Each step of the OCR pipeline is expressed mathematically, yielding a set of mathematical expressions. These expressions are integrated and revised to build a mathematical model for recognizing optical characters. The model is then evaluated and optimized, and its prediction accuracy is measured experimentally.
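As a rough illustration of the preprocessing stages named in the abstract, the following minimal sketch chains binarization, a vertical projection, and size normalization. It is not the authors' model: this record does not specify the reciprocal-Gaussian cascade filter or the projection angle transformation, so the sketch omits filtering and uses a plain column-sum projection to split characters. The function names, the fixed threshold of 128, and the nearest-neighbour resampling are all assumptions made for the example.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1, where 1 marks dark ink.
    The fixed threshold is an assumption, not the paper's rule."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def vertical_projection(binary):
    """Count ink pixels in each column of a binary image."""
    return [sum(col) for col in zip(*binary)]


def split_columns(projection):
    """Return (start, end) column ranges, end exclusive, where ink is present."""
    spans, start = [], None
    for x, count in enumerate(projection):
        if count and start is None:
            start = x                      # a character begins
        elif not count and start is not None:
            spans.append((start, x))       # a blank column ends it
            start = None
    if start is not None:
        spans.append((start, len(projection)))
    return spans


def normalize(glyph, size=4):
    """Resample a glyph to size x size with nearest-neighbour sampling."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def extract_glyphs(gray, threshold=128, size=4):
    """Binarize, split on blank columns, and normalize each character."""
    binary = binarize(gray, threshold)
    spans = split_columns(vertical_projection(binary))
    return [normalize([row[a:b] for row in binary], size) for a, b in spans]
```

Calling `extract_glyphs(image, size=16)` on a list-of-rows grayscale image returns one fixed-size binary grid per character, which is the kind of normalized input a downstream recognizer would consume.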
Authors
WANG Ai-ai (王爱爱)
LIU Zhi-li (刘志立)
Yisheng Innovation Education Base, North China University of Technology, Tangshan, Hebei 063210, China
Source
New Generation of Information Technology (《新一代信息技术》)
2019, No. 7, pp. 19-30 (12 pages)
Keywords
optical symbol recognition
post-processing system
reciprocal-Gaussian cascade low-pass filtering
gray value