Abstract
When documents are scanned, the resulting images are almost inevitably skewed, and algorithms for layout analysis and character recognition are very sensitive to page skew. Skew detection and correction is therefore an important step in the preprocessing stage of document image analysis (DIA). This paper proposes a skew detection method based on straight-line fitting. The document image is binarized and divided into blocks, and each block is Fourier transformed to obtain its Fourier spectrum. Feature points that reflect the skew angle are extracted from the spectrum, a straight line is fitted to these points, and the page skew angle is obtained from the fitted line. Experimental results show that the method detects the skew angle accurately and is insensitive to document layout, line spacing, and font.
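The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how blocks are formed, how feature points are selected, or which fitting procedure is used. The sketch below assumes a single block, takes the strongest spectrum points as "feature points", and fits a weighted least-squares line through the spectrum origin. The function names `make_skewed_lines` and `estimate_skew` and all parameter values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def make_skewed_lines(size=256, period=16, skew_deg=5.0):
    """Synthetic binary 'page': parallel stripes standing in for text lines.

    Positive skew means the stripes descend to the right on screen
    (image y axis points down). This is an illustrative test image only.
    """
    theta = np.deg2rad(skew_deg)
    y, x = np.mgrid[0:size, 0:size]
    # Signed distance of each pixel along the normal of the skewed lines.
    d = -x * np.sin(theta) + y * np.cos(theta)
    return ((d % period) < period / 2).astype(float)


def estimate_skew(img, n_points=20):
    """Estimate the skew angle in degrees from the Fourier spectrum.

    Assumption: the strongest spectrum points lie on a line through the
    origin that is perpendicular to the text lines.
    """
    h, w = img.shape
    # Remove the mean and apply a Hann window to limit spectral leakage.
    window = np.outer(np.hanning(h), np.hanning(w))
    spec = np.abs(np.fft.fftshift(np.fft.fft2((img - img.mean()) * window)))

    # Suppress the DC neighbourhood so it is not picked as a feature point.
    cy, cx = h // 2, w // 2
    spec[cy - 1:cy + 2, cx - 1:cx + 2] = 0.0

    # "Feature points": the n_points strongest spectrum samples.
    idx = np.argsort(spec.ravel())[-n_points:]
    v, u = np.unravel_index(idx, spec.shape)
    fu = (u - cx) / w  # horizontal frequency, cycles per pixel
    fv = (v - cy) / h  # vertical frequency, cycles per pixel
    wts = spec.ravel()[idx]

    # Weighted least-squares line through the origin: the dominant
    # eigenvector of the weighted scatter matrix gives its direction.
    scatter = np.array([
        [np.sum(wts * fu * fu), np.sum(wts * fu * fv)],
        [np.sum(wts * fu * fv), np.sum(wts * fv * fv)],
    ])
    _, evecs = np.linalg.eigh(scatter)
    du, dv = evecs[:, -1]

    # The spectral line is normal to the text lines, so subtract 90 degrees
    # and wrap the result into [-90, 90).
    angle = np.degrees(np.arctan2(dv, du))
    return (angle - 90.0 + 90.0) % 180.0 - 90.0


if __name__ == "__main__":
    page = make_skewed_lines(skew_deg=5.0)
    print(f"estimated skew: {estimate_skew(page):.2f} degrees")
```

A real document would first be binarized and split into blocks, with the per-block estimates combined in some way. Those steps are omitted here because the abstract gives no detail on them.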
Source
《计算机应用研究》
CSCD
Peking University Core Journals (北大核心)
2005, No. 6, pp. 251-253 (3 pages)
Application Research of Computers