
Gabor Filter Based Text Extraction from Digital Document Images (original title: 基于Gabor滤波器的数字文档图像文字提取算法). Cited by: 4
Abstract: This paper presents an algorithm that automatically detects and extracts text in digital document images. First, the image is Gabor-filtered at several orientations and scales, and the filtered images are fused into a single image that reflects the layout of the document. Candidate text regions are then extracted directly from this fused image. Finally, two screening criteria, one based on geometric properties and one on high-frequency content, are applied to reject non-text regions. Experiments were performed on representative document images of different types, with text in different languages and fonts. The results show that the algorithm performs satisfactorily across a wide variety of document images.
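The first two stages described in the abstract (a Gabor filter bank over several orientations and scales, followed by fusion and candidate extraction) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the choice of wavelengths, the rule `sigma = 0.5 * wavelength`, fusion by per-pixel maximum, and the fixed-fraction threshold. It uses only the Python standard library so that it stays self-contained; a practical implementation would likely use an array library, and the geometric and high-frequency screening steps are not shown.

```python
import math

def gabor_kernel(sigma, theta, wavelength, half_size):
    """Build a complex Gabor kernel; returns (real, imag) as nested lists."""
    real, imag = [], []
    for y in range(-half_size, half_size + 1):
        row_r, row_i = [], []
        for x in range(-half_size, half_size + 1):
            # Coordinate along the direction of the sinusoidal carrier.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * xr / wavelength
            row_r.append(envelope * math.cos(phase))
            row_i.append(envelope * math.sin(phase))
        real.append(row_r)
        imag.append(row_i)
    # Remove the mean of the real part so flat regions give a zero response.
    n = 2 * half_size + 1
    mean = sum(map(sum, real)) / (n * n)
    real = [[v - mean for v in row] for row in real]
    return real, imag

def filter_magnitude(image, kernel):
    """Magnitude of the complex filter response, with edge-replicated borders."""
    real, imag = kernel
    half = len(real) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc_r = acc_i = 0.0
            for dy in range(-half, half + 1):
                yy = min(max(i + dy, 0), h - 1)  # clamp row index to the image
                for dx in range(-half, half + 1):
                    xx = min(max(j + dx, 0), w - 1)  # clamp column index
                    p = image[yy][xx]
                    acc_r += p * real[dy + half][dx + half]
                    acc_i += p * imag[dy + half][dx + half]
            out[i][j] = math.hypot(acc_r, acc_i)
    return out

def fused_response(image, wavelengths=(4.0, 8.0), n_orient=4, half_size=4):
    """Fuse the filter bank by per-pixel maximum (an assumed fusion rule)."""
    h, w = len(image), len(image[0])
    fused = [[0.0] * w for _ in range(h)]
    for wl in wavelengths:
        for k in range(n_orient):
            theta = k * math.pi / n_orient
            # sigma tied to wavelength is an illustrative choice; the fixed
            # half_size truncates the larger-scale kernels fairly coarsely.
            kern = gabor_kernel(0.5 * wl, theta, wl, half_size)
            mag = filter_magnitude(image, kern)
            fused = [[max(a, b) for a, b in zip(fr, mr)]
                     for fr, mr in zip(fused, mag)]
    return fused

def candidate_mask(fused, fraction=0.5):
    """Mark pixels whose fused response reaches a fraction of the peak."""
    peak = max(map(max, fused))
    if peak == 0.0:
        return [[False] * len(row) for row in fused]
    return [[v >= fraction * peak for v in row] for row in fused]
```

Calling `candidate_mask(fused_response(image))` on a grayscale image stored as a list of rows yields a boolean map of high-texture pixels. Grouping those pixels into regions and rejecting non-text regions by geometry and high-frequency content would be separate steps, which the abstract mentions but does not detail.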
Affiliation: Harbin Institute of Technology
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, Issue B12, pp. 2387-2390 (4 pages)
Keywords: text extraction; Gabor filter; digital document images