Abstract
Text detection in images is an important component of content-based retrieval. In natural scenes, however, detection is difficult because of complex backgrounds, unpredictable text size, and sensitivity to illumination and stains. Starting from the maximally stable extremal regions (MSER) extracted from the image, candidate text regions are obtained by exploiting the consistency of texture features among characters in the same region. A new regularity feature is then proposed, based on the stroke structure of Chinese characters, and is combined with general features to classify the connected components using the AdaBoost algorithm. The remaining connected components are merged according to the structure of Chinese characters, and heuristic rules of text layout are applied to locate the text precisely. Simulation experiments show that the new algorithm achieves good results on natural scene images.
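To make the classification step more concrete, the following is a minimal sketch of discrete AdaBoost with threshold stumps, applied to connected components. It is an illustration under stated assumptions, not the authors' implementation: the two features (aspect ratio and fill ratio), the toy data, the choice of decision stumps as weak learners, and all function names are hypothetical, and the regularity feature proposed in the paper is not reproduced here.

```python
import math

def stump_predict(x, j, t, polarity):
    # Weak learner: returns `polarity` when feature j exceeds threshold t,
    # otherwise the opposite label.
    return polarity if x[j] > t else -polarity

def train_stump(X, y, w):
    # Exhaustively search for the stump with the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        # One threshold below all values, then midpoints between neighbours.
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost_fit(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n                      # start from uniform sample weights
    model = []
    for _ in range(rounds):
        err, j, t, pol = train_stump(X, y, w)
        if err >= 0.5:                     # no better than chance: stop
            break
        err = max(err, 1e-10)              # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, pol))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, t, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, x):
    score = sum(alpha * stump_predict(x, j, t, pol) for alpha, j, t, pol in model)
    return 1 if score >= 0 else -1

# Hypothetical features per connected component: (aspect ratio, fill ratio).
# Label +1 marks a text component, -1 a non-text component.
X = [(1.0, 0.45), (0.9, 0.50), (1.1, 0.40), (1.2, 0.55),
     (4.0, 0.90), (0.2, 0.95), (3.5, 0.15), (5.0, 0.85)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

model = adaboost_fit(X, y, rounds=3)
train_predictions = [adaboost_predict(model, x) for x in X]
```

On this toy set no single stump separates the two classes, but the weighted vote of three stumps classifies every training sample correctly. In a real pipeline the feature vectors would be computed from the MSER-derived connected components rather than typed in by hand.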
Source
《电视技术》
PKU Core Journal
2015, Issue 21, pp. 1-4, 14 (5 pages in total)
Video Engineering
Funding
National Natural Science Foundation of China (61305122)