Document image retrieval based on multi-density features
Cited by: 6
Abstract: The growth of document image databases poses new challenges for document image retrieval. Methods based on layout reconstruction require high-quality images and handle only a few widely used languages, while methods based on layout segmentation are limited by the complexity of document layouts and by the state of layout analysis techniques. This paper proposes a retrieval method based on multi-density features of binary document images that relies on neither optical character recognition (OCR) nor layout analysis. Preprocessing, including skew correction and marginal noise (black border) removal, yields the effective text area. The aspect ratio and multi-density features of this area are then extracted, and retrieval is performed by comparing these features against those of the images in the database. Experiments show that the method adapts to different resolutions and input devices and is robust to complex layouts and to noise such as annotations, with a miss rate of 2% or less, making it a simple and effective document image retrieval method.
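Source: Journal of Tsinghua University (Science and Technology), 2006, No. 7, pp. 1231-1234 (4 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (60472028); Doctoral Program Foundation of the Ministry of Education of China (20040003015)
Keywords: document image; image retrieval; skew correction; multi-density features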
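
The abstract does not define how the density layers are built, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that the layers are successively finer grids (1x1, 2x2, 4x4) over the text area, that the text area can be approximated by the bounding box of the black pixels (skew correction and real marginal noise removal are omitted), and that features are compared with a simple absolute-difference score. All function names and the `levels` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]        # binary image: 1 = black (ink), 0 = white
Features = Tuple[float, List[float]]   # (aspect ratio, list of cell densities)


def text_bounding_box(img: Image) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right) of the ink; bottom/right are exclusive.

    Assumption: this stands in for the paper's preprocessing stage.
    """
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        raise ValueError("image contains no black pixels")
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def multi_density_features(img: Image, levels: int = 3) -> Features:
    """Aspect ratio of the text area plus black-pixel densities on nested grids.

    Assumption: layer k splits the text area into 2**k x 2**k cells.
    """
    top, bottom, left, right = text_bounding_box(img)
    height, width = bottom - top, right - left
    densities: List[float] = []
    for level in range(levels):
        n = 2 ** level
        for i in range(n):
            r0, r1 = top + i * height // n, top + (i + 1) * height // n
            for j in range(n):
                c0, c1 = left + j * width // n, left + (j + 1) * width // n
                area = (r1 - r0) * (c1 - c0)
                black = sum(img[r][c] for r in range(r0, r1) for c in range(c0, c1))
                densities.append(black / area if area else 0.0)
    return width / height, densities


def feature_distance(a: Features, b: Features) -> float:
    """Hypothetical matching score: smaller means more similar."""
    (ratio_a, dens_a), (ratio_b, dens_b) = a, b
    mean_diff = sum(abs(x - y) for x, y in zip(dens_a, dens_b)) / len(dens_a)
    return abs(ratio_a - ratio_b) + mean_diff
```

Because every feature in this sketch is a ratio, rescaling an image leaves the features essentially unchanged, which is one plausible reading of the resolution adaptivity reported in the abstract. A query would be matched by computing `feature_distance` against each stored feature vector and keeping the smallest scores.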

References (7)

  • [1] PENG Hanchuan, LONG Fuhui, CHI Zheru, et al. Document image template matching based on component block list [J]. Pattern Recognition Letters, 2001, 22(9): 1033-1042.
  • [2] Kise K, Yin W, Matsumoto K. Document image retrieval based on 2D density distributions of terms with pseudo relevance feedback [C] // Proc ICDAR. ACM, NY: [s.n.], 2003: 488-492.
  • [3] PENG Hanchuan, LONG Fuhui, CHI Zheru. Document image recognition based on template matching of component block projections [J]. IEEE Trans on Pattern Analysis and Machine Intelligence, 2003, 25(9): 1188-1192.
  • [4] WANG Shuhua, LI Zuo, CAI Shijie. Document image skew detection method based on least squares [J]. Computer Applications and Software, 2001, 18(9): 43-46. (in Chinese) Cited by: 15
  • [5] WANG Tongqing, ZHU Yongquan, WANG Hong. Document image skew correction based on run-length smoothing [J]. Computer Engineering, 2004, 30(1): 141-143. (in Chinese) Cited by: 11
  • [6] FAN Kuo-Chin, WANG Yuan-Kai, LAY Tsann-Ran. Marginal noise removal of document images [J]. Pattern Recognition, 2002, 35: 2593-2611.
  • [7] Theoretical foundations of computer-based retrieval [EB/OL]. http://libweb.zju.edu.cn/aduser/service/lesson/Teach/SearchYan/CH1/Ch11.htm, 2005. (in Chinese)


