Abstract
An efficient skew detection method is proposed for complex document images that contain figures and tables and are scanned against varying backgrounds. The method has three steps. First, a feature sub-region containing text is selected adaptively through statistical analysis of the gradient image, using the fact that edges in text regions are stronger than those in non-text regions. Second, the blank bars between adjacent text lines within that sub-region are extracted by searching for blank blocks. Third, each blank bar is treated as an implicit line, and its skew angle is computed by directional fitting based on optimization theory; this angle is taken as the skew angle of the document. Experimental results show that the method is insensitive to scan background, border size, document layout, and line spacing, and that it detects the skew angle quickly and accurately with good adaptability.
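The third step (treating a blank bar as an implicit line and fitting its direction) can be illustrated with a minimal sketch. The abstract does not specify the paper's optimization procedure, so ordinary least squares is used here as a stand-in. The function names `make_skewed_page`, `trace_blank_band`, and `fit_skew_degrees`, the binary image representation (1 = ink, 0 = blank), and the synthetic two-line page are all assumptions made for illustration. The adaptive sub-region selection of the first step is not covered.

```python
import math


def make_skewed_page(width, height, angle_deg):
    """Build a synthetic binary page (hypothetical test data): two 10-row
    text bands whose rows shift with x according to the skew angle."""
    slope = math.tan(math.radians(angle_deg))
    img = [[0] * width for _ in range(height)]
    for x in range(width):
        shift = int(round(slope * x))
        for base in (10, 30):
            for y in range(base + shift, base + shift + 10):
                if 0 <= y < height:
                    img[y][x] = 1
    return img


def trace_blank_band(img, seed_row):
    """Follow the blank band containing seed_row from left to right.

    For each column, grow upward and downward from the current row until
    ink (or the image border) is reached, and record the midpoint of the
    blank run. The rounded midpoint seeds the next column.
    """
    height = len(img)
    points = []
    row = seed_row
    for x in range(len(img[0])):
        if img[row][x]:  # seed landed on ink: the band is lost, stop here
            break
        top = row
        while top > 0 and not img[top - 1][x]:
            top -= 1
        bottom = row
        while bottom < height - 1 and not img[bottom + 1][x]:
            bottom += 1
        mid = (top + bottom) / 2.0
        points.append((x, mid))
        row = int(round(mid))
    return points


def fit_skew_degrees(points):
    """Least-squares line fit through the band midpoints (a stand-in for
    the paper's directional fitting). Returns the angle in degrees;
    positive means the band descends to the right in image coordinates."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two midpoints to fit a direction")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


if __name__ == "__main__":
    page = make_skewed_page(200, 60, 3.0)
    midpoints = trace_blank_band(page, seed_row=25)
    print(fit_skew_degrees(midpoints))
```

Fitting the blank band rather than the text itself means the estimate does not depend on character shapes, which is consistent with the abstract's claim of insensitivity to layout and line spacing. A real page would still need a reliable seed row inside a blank bar, which this sketch simply assumes.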
Source
《计算机应用》
CSCD
Peking University Core Journal (北大核心)
2006, Issue 7, pp. 1587-1589, 1597 (4 pages)
Journal of Computer Applications
Keywords
complex document image
skew detection
feature sub-region
statistical analysis