Abstract
This paper presents a text localization method that combines projection analysis with support vector machine (SVM) learning. Localization proceeds in two steps. In the first step, candidate text regions are extracted by an edge-based projection analysis: the edges of the image are extracted, morphological operations are applied to cluster the edges, and the text regions are then located by repeated horizontal and vertical projection. In the second step, an SVM classifies the candidates as true or false text regions, and the false ones are discarded. The features used for SVM classification are wavelet features, corners, scan lines, and the center of gravity of the edge points within each region. Experiments show that the method performs better than a purely edge-based method.
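The abstract does not give the details of the repeated projection step, so the following is only a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes a binary edge map that has already been through edge extraction and morphological clustering, and the function names `runs_above` and `localize_by_projection` and the two threshold parameters are hypothetical. Rows are projected first to find horizontal bands, columns are projected inside each band, and rows are projected once more inside each resulting box to tighten it.

```python
import numpy as np


def runs_above(profile, threshold):
    """Return (start, end) pairs, end exclusive, where profile > threshold."""
    # Pad with False so runs touching either border are still closed.
    mask = np.concatenate(([False], profile > threshold, [False]))
    change = np.flatnonzero(mask[1:] != mask[:-1])
    # Changes alternate between run starts and run ends.
    return list(zip(change[0::2].tolist(), change[1::2].tolist()))


def localize_by_projection(edge_map, row_thresh=0, col_thresh=0):
    """Candidate boxes (top, bottom, left, right), end exclusive.

    edge_map is assumed to be a 2-D array of 0/1 edge pixels.
    The thresholds are illustrative; the paper does not state them.
    """
    boxes = []
    # Pass 1: horizontal projection (edge count per row) gives row bands.
    row_profile = edge_map.sum(axis=1)
    for top, bottom in runs_above(row_profile, row_thresh):
        # Pass 2: vertical projection inside the band gives column ranges.
        col_profile = edge_map[top:bottom].sum(axis=0)
        for left, right in runs_above(col_profile, col_thresh):
            # Pass 3: project rows again inside the box to tighten it.
            inner = edge_map[top:bottom, left:right].sum(axis=1)
            for t, b in runs_above(inner, row_thresh):
                boxes.append((top + t, top + b, left, right))
    return boxes
```

The boxes returned by such a step would only be candidates; in the method described above they are then passed to the SVM, which rejects the false text regions.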
Source
Science & Technology Information (《科技信息》)
2011, Issue 27, pp. I0040-I0042, I0083 (4 pages)
Keywords
Edge
Text localization
Support vector machine
Multiple projection analysis