Abstract
This paper proposes a text extraction method based on connected components that automatically detects and segments text blocks in natural scenes. First, a fast connected-component algorithm generates the connected components. Geometric features are then used to filter out a large number of them, and wavelet and texture features are extracted from the remaining components. These features are fed to AdaBoost as input, and the strong classifier trained by the AdaBoost algorithm quickly rejects the non-text connected components in the image. A connected-component block that passes the strong classifier is treated as a text component and becomes part of the final extraction result. By combining discrete wavelet and texture features, the method reaches a text localization accuracy of more than 95.56%. In addition, the fast convergence of the AdaBoost algorithm reduces wasted computation.
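The first two stages described in the abstract (generating connected components, then discarding most of them with geometric features) can be illustrated with a minimal sketch. The abstract does not say which fast labeling algorithm or which geometric features the authors use, so everything below is an assumption made for illustration: a plain breadth-first search with 8-connectivity stands in for the fast algorithm, and the features (area, aspect ratio, fill ratio) and the thresholds `min_area`, `max_aspect`, and `min_fill` are hypothetical. The wavelet, texture, and AdaBoost stages are not shown.

```python
from collections import deque


def connected_components(img):
    """Label 8-connected foreground regions (value 1) of a binary image.

    `img` is a list of equal-length rows of 0/1 values. Returns a list of
    components in scan order, each a list of (row, col) pixels.
    Plain BFS is used here as a stand-in for the paper's fast algorithm.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                # Visit all 8 neighbours of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def geometric_features(pixels):
    """Bounding-box based features of one component (hypothetical choice)."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(pixels)
    return {
        "width": width,
        "height": height,
        "area": area,
        "aspect": width / height,          # width-to-height ratio
        "fill": area / (width * height),   # share of the box that is foreground
    }


def geometric_filter(components, min_area=4, max_aspect=5.0, min_fill=0.2):
    """Keep components whose geometry is plausible for a character.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for pixels in components:
        f = geometric_features(pixels)
        if (f["area"] >= min_area
                and 1.0 / max_aspect <= f["aspect"] <= max_aspect
                and f["fill"] >= min_fill):
            kept.append(pixels)
    return kept
```

Calling `geometric_filter(connected_components(img))` on a binarized image returns the candidate components that would go on to feature extraction and classification. In this sketch, isolated specks fail the area test and long thin strokes fail the aspect-ratio test.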
Authors
郑建云
刘军华
雷超阳
ZHENG Jianyun; LIU Junhua; LEI Chaoyang (Hunan Post and Telecommunication Vocational College, Changsha 410015, China)
Source
《钦州学院学报》
2019, No. 1, pp. 44-49 (6 pages)
Journal of Qinzhou University