Abstract
Document image rectification techniques based on 2D image processing usually begin by segmenting words or characters, which makes the algorithms complex and time-consuming. To address this problem, this paper proposes an improved rectification algorithm. First, an upward-priority region growing method searches for the blank space between text lines, which segments the page into warped line images. Next, an improved two-pass scanning method labels the 4-connected components of the characters in each warped line image to obtain line base points, and a cubic polynomial is fitted to those base points by least squares. Finally, the polynomial curves are straightened and the lines are stitched back into a full image. OCR application and comparison experiments show that the algorithm improves the character recognition rate and produces visually pleasing output.
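As a rough illustration of the labeling step, the sketch below implements the classic two-pass scan for 4-connected components and derives one base point per component. It is not the paper's improved variant, whose details the abstract does not give. The function names `label_4connected` and `base_points`, the 0/1 list-of-lists image format, and the choice of "bottom row, horizontal center of the bounding box" as the base point are all assumptions made for this example.

```python
def label_4connected(img):
    """Classic two-pass labeling of foreground (nonzero) pixels, 4-connectivity.

    img is a list of rows of 0/1 values (assumed format for this sketch).
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # union-find forest; index 0 is reserved for background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Pass 1: assign provisional labels and record label equivalences.
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new label is its own root
                labels[y][x] = len(parent) - 1
            elif up and left:
                ru, rl = find(up), find(left)
                lo, hi = min(ru, rl), max(ru, rl)
                parent[hi] = lo  # merge the two equivalence classes
                labels[y][x] = lo
            else:
                labels[y][x] = up or left

    # Pass 2: replace each provisional label with its root label.
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels


def base_points(labels):
    """One (x, y) point per component: bounding-box center column, bottom row.

    This definition of a base point is an assumption for illustration.
    """
    boxes = {}  # label -> [min_x, max_x, max_y]
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                b = boxes.setdefault(lab, [x, x, y])
                b[0] = min(b[0], x)
                b[1] = max(b[1], x)
                b[2] = max(b[2], y)
    return sorted(((b[0] + b[1]) / 2, b[2]) for b in boxes.values())
```

The resulting points could then be passed to any least-squares cubic fit, for example `numpy.polyfit(xs, ys, 3)`, to estimate the curved baseline of a line image.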
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2017, No. 4, pp. 277-280, 286 (5 pages)
Computer Engineering
Funding
Anhui Normal University Talent Cultivation Fund (2011rcpy029)
Keywords
convolution
least-squares polynomial fitting
two-pass scanning method
region growing method
image straightening