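Abstract
MSER connected components are widely used in text detection methods, mainly to generate the initial set of text candidates for subsequent algorithms to filter. The quality of the algorithm that filters MSER connected components directly affects the final result of the overall text detection. At present, datasets and methods exist only for evaluating text detection as a whole; there is no evaluation mechanism aimed solely at MSER filtering algorithms. This paper presents a method and framework for evaluating algorithms that filter MSER candidate sets. The method first traverses the MSER candidate set with a forest-based level-order traversal, then randomly colors the MSER candidate nodes within the same level. Regions that meet the annotation criteria are selected and marked as positive samples, and the remaining unmarked regions are negative samples; together they form the evaluation dataset. Based on this dataset, the performance of various filtering algorithms can be evaluated. The paper tests an MSER pruning algorithm, deep learning, and SVM classification for filtering the candidate set and evaluates their results, demonstrating the effectiveness of the evaluation method and dataset.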
MSER-based methods are widely used in text detection and have reported promising performance. MSERs serve as the primary candidate regions for text detection, and screening the right candidates is the key step. So far, existing studies evaluate only the performance of text detection as a whole; none evaluates the performance of the screening algorithms themselves. This paper proposes a method to evaluate the performance of MSER candidate screening algorithms. First, the MSER tree is traversed level by level to obtain candidates, and the candidates are colored randomly. The colored candidates are then drawn in a picture; the correct candidates are chosen and recorded with label 1, and the untagged candidates are recorded with label 0. Together they compose the evaluation dataset. With this dataset, the performance of any screening algorithm can be evaluated. A pruning algorithm, deep learning, and an SVM are used to screen MSER candidates, and their performance is compared on the dataset. Based on the comparison, a suitable algorithm can be chosen to make text detection more effective.
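The dataset-construction steps described in the abstract (level-by-level traversal of the MSER forest, random coloring within each level, labeling chosen candidates 1 and the rest 0) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Region` class, the function names, and the use of precision and recall as the comparison metric are all assumptions introduced here, since the abstract does not state how regions are stored or which scores are reported.

```python
import random
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical MSER node: an id plus the regions nested inside it."""
    rid: int
    children: list = field(default_factory=list)


def levels(roots):
    """Level-order traversal of a forest; returns one list of nodes per depth."""
    out = []
    queue = deque(roots)
    while queue:
        level = list(queue)
        out.append(level)
        # The next level is every child of every node in the current level.
        queue = deque(child for node in level for child in node.children)
    return out


def color_levels(roots, seed=0):
    """Assign a random RGB color to each node, grouped by depth (for display)."""
    rng = random.Random(seed)
    colored = []
    for depth, level in enumerate(levels(roots)):
        for node in level:
            rgb = tuple(rng.randrange(256) for _ in range(3))
            colored.append((depth, node.rid, rgb))
    return colored


def build_labels(roots, positive_ids):
    """Label annotator-chosen regions 1 and every other region 0."""
    return {node.rid: int(node.rid in positive_ids)
            for level in levels(roots) for node in level}


def precision_recall(labels, kept_ids):
    """Score a screening algorithm's kept regions against the labels.

    Precision/recall is an assumed metric, not one named in the abstract.
    """
    kept = set(kept_ids)
    tp = sum(1 for rid in kept if labels.get(rid) == 1)
    fn = sum(1 for rid, y in labels.items() if y == 1 and rid not in kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

A screening algorithm (pruning, a neural network, or an SVM) would then be reduced to the set of region ids it keeps, so that `precision_recall(labels, kept_ids)` gives a comparable score for each algorithm on the same labeled candidates.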
Source
《信息技术》
2018, No. 1, pp. 100-104, 109 (6 pages)
Information Technology