Abstract
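During layout analysis, tables are sometimes misclassified as graphics, and graphics as tables. To avoid the erroneous results produced when a misclassified table or graphic is passed on to recognition, this article proposes a method that distinguishes tables from graphics using information about table frame lines and table cells. Drawing on the structural characteristics of tables, the method defines a set of constraints that frame lines and cells, as the essential components of a table, must satisfy, and it distinguishes tables from graphics by verifying whether each constraint holds. Experiments show that the method effectively distinguishes the vast majority of tables from graphics and greatly reduces the misclassification rate.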
Layout analysis may mistake tables for graphics and graphics for tables. To avoid the recognition errors that follow from such misclassification, this paper presents a method for distinguishing tables from graphics based on structural constraints on table frame lines and cells. Based on the structure of a table, the paper states several necessary constraints that all frame lines and cells in a table must satisfy, and it verifies each of these constraints to distinguish tables from graphics. Experiments show that the method is effective.
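The abstract does not list the actual constraints, so the following is only a minimal sketch of the general idea of verifying necessary conditions one by one. The three conditions used here (axis-aligned lines, at least two lines in each direction, and no dangling endpoints) are illustrative assumptions, not the constraints defined in the paper, and all names (`Line`, `looks_like_table`, `tol`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """A detected line segment in pixel coordinates (hypothetical type)."""
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def horizontal(self) -> bool:
        return self.y1 == self.y2


def touches(point, line, tol=2):
    """True if the point lies on an axis-aligned line, within tol pixels."""
    px, py = point
    if line.horizontal:
        lo, hi = sorted((line.x1, line.x2))
        return abs(py - line.y1) <= tol and lo - tol <= px <= hi + tol
    lo, hi = sorted((line.y1, line.y2))
    return abs(px - line.x1) <= tol and lo - tol <= py <= hi + tol


def lines_axis_aligned(lines):
    """Assumed constraint 1: every frame line is horizontal or vertical."""
    return all(l.x1 == l.x2 or l.y1 == l.y2 for l in lines)


def has_grid(lines):
    """Assumed constraint 2: at least two lines in each direction."""
    h = sum(l.horizontal for l in lines)
    return h >= 2 and len(lines) - h >= 2


def endpoints_anchored(lines, tol=2):
    """Assumed constraint 3: each endpoint meets a perpendicular line."""
    for l in lines:
        for p in ((l.x1, l.y1), (l.x2, l.y2)):
            if not any(o.horizontal != l.horizontal and touches(p, o, tol)
                       for o in lines):
                return False
    return True


def looks_like_table(lines, tol=2):
    """Accept the region as a table only if every assumed constraint holds."""
    return (lines_axis_aligned(lines)
            and has_grid(lines)
            and endpoints_anchored(lines, tol))


# A 1x2 grid: two horizontal borders and three vertical borders.
grid = [Line(0, 0, 20, 0), Line(0, 10, 20, 10),
        Line(0, 0, 0, 10), Line(10, 0, 10, 10), Line(20, 0, 20, 10)]
# A chart-like drawing: two axes plus two unconnected strokes.
chart = [Line(0, 0, 20, 0), Line(0, 0, 0, 10),
         Line(5, 5, 15, 5), Line(30, 0, 30, 10)]
print(looks_like_table(grid), looks_like_table(chart))
```

Because each condition is only necessary, failing any single check is enough to reject a region as a table, which is why the checks are combined with a short-circuiting `and`.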
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2004, Issue 12, pp. 83-87 (5 pages)
Computer Engineering and Applications
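Funding
Supported by the National Natural Science Foundation of China (Grant No. 60241005)
Supported by the National High Technology Research and Development Program of China (863 Program) (Grant No. 2001AA114081)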