Abstract
Based on the features of line intersections in table images and on the dependencies between name (title) fields and data fields, table layouts are classified into six basic structures, from which the Maximum Attributive Zone (MAZ) of a point is defined. An MAZ-based algorithm for segmenting table-form images is then proposed. The algorithm not only analyzes the logical structure of filled-in tables but also segments them according to the basic layout structures, so that interdependent cells are grouped into the same sub-table. Experimental results demonstrate that the method is effective.
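As an illustration of what a "line intersection feature" can look like, the following minimal sketch labels each crossing of horizontal and vertical ruling lines by the arms that meet there. It is not the paper's algorithm: the abstract does not specify the six basic structures or the MAZ construction, and the segment format, the function names `junction_name` and `classify_intersections`, and the junction labels are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input format (an assumption, not taken from the paper):
#   horizontal segment = (y, x_start, x_end)
#   vertical segment   = (x, y_start, y_end)
# Image coordinates are assumed, so y grows downward.
HSeg = Tuple[int, int, int]
VSeg = Tuple[int, int, int]


def junction_name(arms: Set[str]) -> str:
    """Map the set of arms present at a point to a coarse junction label."""
    if len(arms) == 4:
        return "cross"
    if len(arms) == 3:
        return "T"
    has_h = bool(arms & {"left", "right"})
    has_v = bool(arms & {"up", "down"})
    if len(arms) == 2 and has_h and has_v:
        return "corner"
    return "other"


def classify_intersections(
    h_segs: List[HSeg], v_segs: List[VSeg]
) -> Dict[Tuple[int, int], str]:
    """Find every point where a horizontal and a vertical segment meet,
    and label it by which of the four arms extend from that point."""
    arms_at: Dict[Tuple[int, int], Set[str]] = {}
    for y, x0, x1 in h_segs:
        for x, y0, y1 in v_segs:
            if x0 <= x <= x1 and y0 <= y <= y1:
                arms = arms_at.setdefault((x, y), set())
                if x > x0:
                    arms.add("left")
                if x < x1:
                    arms.add("right")
                if y > y0:
                    arms.add("up")    # segment continues toward smaller y
                if y < y1:
                    arms.add("down")  # segment continues toward larger y
    return {pt: junction_name(arms) for pt, arms in arms_at.items()}


# A 2x2 grid: outer frame from (0, 0) to (4, 2) with one inner row line
# at y = 1 and one inner column line at x = 2.
h_lines = [(0, 0, 4), (1, 0, 4), (2, 0, 4)]
v_lines = [(0, 0, 2), (2, 0, 2), (4, 0, 2)]
junctions = classify_intersections(h_lines, v_lines)
```

In this toy grid the four outer corners come out as `corner`, the edge midpoints as `T`, and the center as `cross`. A real pipeline would first have to extract the ruling lines from the scanned image, which this sketch leaves out.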
Source
Journal of Xidian University
EI
CAS
CSCD
PKU Core
2008, No. 2, pp. 293-296 (4 pages)
Funding
Xi'an Science and Technology Program for Trade Promotion (ZX04064)
Keywords
line intersection features
basic layout structure
table image segmentation