Abstract
This paper proposes a method for extracting text regions from natural scene images. The method has two stages: extracting candidate text regions, and classifying each candidate as text or non-text. Candidates are extracted using texture features and HSL color space information, through a modified fuzzy C-means clustering algorithm combined with a Laplacian mask and the maximum gradient difference. Each candidate is then verified with a Markov random field (MRF) model built on the edge density and shape information of its connected components. Tests on the ICDAR 2003 database show that the method achieves reasonable accuracy for text extraction.
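The Laplacian mask and maximum gradient difference step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the mask, the window size, or how the result feeds the clustering, so the 4-neighbour mask, the horizontal window of 5 pixels, and the function names `laplacian_filter` and `max_gradient_difference` below are all assumptions made for illustration, not the authors' implementation.

```python
# 3x3 Laplacian mask (4-neighbour form). Assumed: the abstract does not specify the mask.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_filter(gray):
    """Apply the mask to a grayscale image given as a list of rows.

    Border pixels are left at 0 because the 3x3 neighbourhood is incomplete there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                LAPLACIAN[j][i] * gray[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out


def max_gradient_difference(filtered, window=5):
    """For each pixel, return max minus min of the filtered values
    inside a horizontal strip of `window` pixels centred on it.

    The window size of 5 is a hypothetical default.
    """
    half = window // 2
    h, w = len(filtered), len(filtered[0])
    mgd = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The strip is clipped at the left and right image borders.
            strip = filtered[y][max(0, x - half):x + half + 1]
            mgd[y][x] = max(strip) - min(strip)
    return mgd
```

Pixels on dense, alternating strokes produce large positive and negative Laplacian responses close together, so their maximum gradient difference is high, while flat background stays near zero. Such a map could then serve as one texture feature for clustering pixels into candidate text and non-text groups.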
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2011, Issue 21, pp. 176-178, 181 (4 pages in total)
Computer Engineering