Abstract
To address the low recognition rate of traditional OCR on text in natural scene images, this paper designs and implements an automatic text recognition system for natural scene images. The system locates text with an improved MSER (maximally stable extremal regions) scene text localization algorithm, trains Tesseract on image text samples, and then recognizes the text in scene images; the system is implemented in an MFC environment. Experimental results show that the trained Tesseract library significantly improves text recognition on natural scene images.
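The abstract does not say how the MSER localization step was improved, so the following is only a generic sketch of the kind of post-processing commonly applied to MSER candidate regions before OCR: discard boxes whose geometry is unlikely to be a character, then group the survivors into text lines. The `Box` type, all function names, and all thresholds (`min_area`, `max_aspect`, `max_gap_ratio`, `min_overlap`) are hypothetical choices for illustration, not the authors' method. In a full pipeline the candidate boxes would come from an MSER detector and each returned line box would be cropped and passed to Tesseract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one candidate region (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def looks_like_character(box, min_area=30, max_aspect=5.0):
    """Geometric filter; the thresholds are illustrative assumptions."""
    if box.w <= 0 or box.h <= 0:
        return False
    area = box.w * box.h
    aspect = max(box.w / box.h, box.h / box.w)
    return area >= min_area and aspect <= max_aspect


def group_into_lines(boxes, max_gap_ratio=1.0, min_overlap=0.5):
    """Greedily chain boxes left to right when they overlap vertically
    and the horizontal gap is small relative to the character height."""
    lines = []
    for box in sorted(boxes, key=lambda b: b.x):
        for line in lines:
            last = line[-1]
            overlap = min(last.y + last.h, box.y + box.h) - max(last.y, box.y)
            gap = box.x - (last.x + last.w)
            if (overlap >= min_overlap * min(last.h, box.h)
                    and gap <= max_gap_ratio * max(last.h, box.h)):
                line.append(box)
                break
        else:
            # No existing line accepted this box, so it starts a new one.
            lines.append([box])
    return lines


def line_bounds(line):
    """Smallest box enclosing every box in one text line."""
    x0 = min(b.x for b in line)
    y0 = min(b.y for b in line)
    x1 = max(b.x + b.w for b in line)
    y1 = max(b.y + b.h for b in line)
    return Box(x0, y0, x1 - x0, y1 - y0)


def localize_text(candidates):
    """Filter candidate regions, then return one bounding box per text line."""
    kept = [b for b in candidates if looks_like_character(b)]
    return [line_bounds(line) for line in group_into_lines(kept)]
```

For example, given three character-sized boxes on one row plus a tiny speck and a long thin stripe, `localize_text` drops the speck and the stripe and returns a single box covering the three characters.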
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2017, No. 11X, pp. 213-216 (4 pages)
Funding
Zhejiang Provincial College Students' Science and Technology Innovation Activity Plan (Xinmiao Talents Program) (No. 2017R417032)