A Form Frame-Line Detection Algorithm Based on Directional Single-Connected Chain (基于有向单连通链的表格框线检测算法)

Cited by: 23
Abstract: Form frame-line detection is the foundation of form recognition. Existing form frame-line detection algorithms are either slow or not robust, and none of them fully exploits the constraint information between form frame lines. This paper proposes a novel bottom-up form frame-line detection algorithm based on a newly defined image structure element, the directional single-connected chain (DSCC). A DSCC is a sequence of black-pixel run-lengths and serves well as a vector primitive for vectorization. Merging DSCCs under constraints derived from the form frame lines removes pseudo lines and completes broken lines, which markedly improves robustness and allows frame lines to be extracted accurately and quickly. Filtering out DSCCs created by noise and accelerating DSCC merging makes the algorithm 3 to 10 times faster, so that its speed is comparable with the well-known projection method and meets practical requirements. Experiments show that the algorithm is fast and robust, tolerates skew of any angle, and withstands moderately severe line breaks.
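As a rough illustration of the primitive the abstract describes, the sketch below builds horizontal chains of vertical black-pixel runs from a binary image. The abstract only states that a DSCC is a sequence of black-pixel run-lengths; the single-connection rule used here (a run extends a chain only if it overlaps exactly one run in the previous column and that run overlaps exactly one run in the current column), the row-overlap connectivity test, and the length-based noise filter are assumptions made for this example. The names `vertical_runs`, `horizontal_chains`, and `drop_short_chains` are hypothetical, and the constraint-based merging step of the paper is not reproduced.

```python
from typing import Dict, List, Tuple

# A run is (column, top_row, bottom_row) with inclusive row bounds.
Run = Tuple[int, int, int]


def vertical_runs(image: List[List[int]]) -> List[List[Run]]:
    """Return, for each column, its vertical runs of black (1) pixels."""
    height = len(image)
    width = len(image[0]) if height else 0
    runs_per_column: List[List[Run]] = []
    for x in range(width):
        runs: List[Run] = []
        y = 0
        while y < height:
            if image[y][x] == 1:
                top = y
                while y < height and image[y][x] == 1:
                    y += 1
                runs.append((x, top, y - 1))
            else:
                y += 1
        runs_per_column.append(runs)
    return runs_per_column


def overlaps(a: Run, b: Run) -> bool:
    """Assumed connectivity test: the two runs share at least one row."""
    return a[1] <= b[2] and b[1] <= a[2]


def horizontal_chains(image: List[List[int]]) -> List[List[Run]]:
    """Group vertical runs in adjacent columns into singly connected chains."""
    columns = vertical_runs(image)
    chains: List[List[Run]] = []
    open_chains: Dict[Run, List[Run]] = {}  # last run of a chain -> chain
    for x, runs in enumerate(columns):
        prev = columns[x - 1] if x > 0 else []
        next_open: Dict[Run, List[Run]] = {}
        for run in runs:
            chain = None
            left = [p for p in prev if overlaps(p, run)]
            if len(left) == 1:
                # The left neighbour must also see only this run on its right.
                right = [r for r in runs if overlaps(left[0], r)]
                if len(right) == 1:
                    chain = open_chains[left[0]]
                    chain.append(run)
            if chain is None:
                # Fork, merge, or gap: start a new chain at this run.
                chain = [run]
                chains.append(chain)
            next_open[run] = chain
        open_chains = next_open
    return chains


def drop_short_chains(chains: List[List[Run]], min_runs: int) -> List[List[Run]]:
    """Assumed noise filter: discard chains with fewer than min_runs runs."""
    return [c for c in chains if len(c) >= min_runs]
```

On a two-pixel-thick horizontal stroke, `horizontal_chains` yields a single chain with one run per column, while a fork or a one-column gap splits the stroke into separate chains. Those separate chains are what a later merging stage would have to reconnect.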
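Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2002, No. 4, pp. 790-796 (7 pages)
Funding: National Natural Science Foundation of China (69972024); National High-Tech Research and Development Program of China (863 Program) (863-306-ZT03-03-1)
Keywords: form recognition; image analysis; optical character recognition (OCR); intelligent document processing; form frame-line detection algorithm; directional single-connected chain (DSCC); line detection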