Distortion Correction Algorithm for Complex Document Image Based on Multi-level Text Detection (cited by 3)
Abstract Distortion correction is a basic preprocessing step for document OCR (optical character recognition) and plays an important role in improving OCR accuracy. Correcting a distorted document image usually depends on extracting its text. However, most current correction algorithms cannot accurately locate and analyze the text in complex documents, so their correction results are unsatisfactory. To address this problem, a text detection framework based on a fully convolutional network is proposed, and the network is trained specifically on synthetic documents so that it accurately obtains text information at three levels: characters, words, and text lines. The text is then sampled adaptively and the page is modeled in three dimensions with a cubic function, which turns the correction problem into a model parameter optimization problem and makes it possible to correct complex document images. Correction experiments on synthetic distorted documents and real test data show that the proposed method extracts text from complex documents accurately and clearly improves the visual quality of the corrected images. Compared with other algorithms, OCR accuracy after correction is significantly higher.
Authors KOU Xi-chao (寇喜超); ZHANG Hong-rui (张鸿锐); FENG Jie (冯杰); ZHENG Ya-yu (郑雅羽) (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; School of Informatics Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 12, pp. 249-255 (7 pages)
Funding National Natural Science Foundation of China (61501402).
Keywords Convolutional neural network; Text detection; Three-dimensional modeling of documents; Document image correction; Optical character recognition
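
The abstract describes sampling text points and fitting a cubic page model whose parameters are then optimized. The following is a minimal two-dimensional sketch of that idea only, not the authors' three-dimensional method: it assumes the sampled points of one text line are already available as (x, y) pairs, fits y = a·x³ + b·x² + c·x + d by ordinary least squares, and flattens the line by subtracting the fitted curvature. The function names `fit_cubic` and `straighten` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fit_cubic(points: Sequence[Point]) -> List[float]:
    """Least-squares fit of y = a*x^3 + b*x^2 + c*x + d; returns [a, b, c, d]."""
    n = 4
    # Accumulate the normal equations (A^T A) w = A^T y,
    # where each row of A is [x^3, x^2, x, 1].
    ata = [[0.0] * n for _ in range(n)]
    aty = [0.0] * n
    for x, y in points:
        row = [x ** 3, x ** 2, x, 1.0]
        for i in range(n):
            aty[i] += row[i] * y
            for j in range(n):
                ata[i][j] += row[i] * row[j]

    # Solve the 4x4 system by Gaussian elimination with partial pivoting.
    m = [ata[i] + [aty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("system is singular; supply more distinct x values")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - rest) / m[i][i]
    return coeffs


def straighten(points: Sequence[Point], coeffs: Sequence[float]) -> List[Point]:
    """Remove the fitted curvature so the line lies near the constant height d."""
    a, b, c, _d = coeffs
    return [(x, y - (a * x ** 3 + b * x ** 2 + c * x)) for x, y in points]


if __name__ == "__main__":
    # Synthetic curved text line (assumed data, for illustration only).
    xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    line = [(x, 0.5 * x ** 3 - x ** 2 + 2 * x + 3) for x in xs]
    params = fit_cubic(line)
    print(params)
    print(straighten(line, params))
```

In the paper's setting the cubic describes a page surface and the parameters are optimized against detected characters, words, and text lines; the sketch above only illustrates how a cubic fit turns straightening into a parameter estimation problem.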