Abstract
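To segment color text images effectively, a new method is proposed for text-background separation in color images with complex backgrounds. The method first clusters the color text image using color-space dimensionality reduction and graph-theory-based color clustering, and derives a series of binary images from the clustering result; these binary images and their combinations form the candidate binarization results. Two types of texture features, related to the run-length histogram and to the spatial-size distribution, are then analyzed and combined with a linear discriminant analysis classifier to select, from the candidates, the binary image with the best text-background separation. Experimental results show that the method's binarization performance is significantly better than that of existing methods.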
Text is an important feature for computer vision, especially in information retrieval applications. This paper presents a novel algorithm for text-background separation (binarization) in color images with complex backgrounds. The algorithm first applies dimensionality reduction and graph-theoretic clustering. Each cluster yields one binary image, and additional binary images are obtained by combining these cluster-related binary images. Two kinds of features that effectively characterize binary texture images are then extracted from each of these binary images: run-length-histogram-based features and spatial-size-distribution-based features. Based on an analysis of these texture features, together with a linear discriminant analysis (LDA) classifier, the binary image that gives the best text-background separation is selected as the final binarization result. Experiments on images collected from the Internet show that the method handles color text images with complex backgrounds effectively, and comparison with existing techniques shows a notable improvement.
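The candidate-generation and run-length steps described in the abstract can be illustrated with a minimal sketch. It assumes the clustering has already produced a per-pixel cluster label map. The function names, the `max_union` limit on how many clusters are combined, the use of horizontal runs only, and the `max_len` bin count are all illustrative assumptions, not details from the paper. The spatial-size-distribution features and the LDA selection step are not shown.

```python
from itertools import combinations


def candidate_masks(labels, max_union=2):
    """Build one binary mask per cluster, plus unions of up to max_union clusters.

    labels is a 2-D list of cluster ids (assumed output of the clustering step).
    max_union is a hypothetical cap on how many clusters are merged.
    """
    cluster_ids = sorted({v for row in labels for v in row})
    candidates = {}
    for k in range(1, max_union + 1):
        for combo in combinations(cluster_ids, k):
            chosen = set(combo)
            candidates[combo] = [
                [1 if v in chosen else 0 for v in row] for row in labels
            ]
    return candidates


def run_length_histogram(mask, max_len=8):
    """Count horizontal foreground runs by length.

    Runs longer than max_len are counted in the last bin.
    """
    hist = [0] * max_len
    for row in mask:
        run = 0
        for v in list(row) + [0]:  # trailing 0 closes a run that reaches the row end
            if v:
                run += 1
            elif run:
                hist[min(run, max_len) - 1] += 1
                run = 0
    return hist


# Toy 2x4 label map with three clusters (made-up data for illustration).
labels = [[0, 0, 1, 1],
          [0, 2, 2, 1]]
masks = candidate_masks(labels)
features = {combo: run_length_histogram(m, max_len=4) for combo, m in masks.items()}
```

In a full pipeline, each candidate's feature vector would be passed to a trained classifier that picks the candidate most likely to be a clean text-background separation; here `features` simply maps each cluster combination to its histogram.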
Source
《中国图象图形学报(A辑)》
CSCD
Peking University Core Journal
2004, No. 3, pp. 290-296 (7 pages)
Journal of Image and Graphics