Abstract
A new method is proposed for detecting multi-oriented, multilingual text regions in natural scenes. First, the proposed edge-enhanced maximally stable extremal regions (MSER) algorithm is used to obtain fewer candidate text regions than the conventional MSER algorithm. Next, a character selecting tree containing multiple classifiers sorts the candidate regions layer by layer and eliminates the vast majority of non-character regions. The proposed multilayer fusion grouping algorithm then merges the remaining candidate characters layer by layer to form text lines. Finally, a random forest classifier verifies the text lines. Tests and comparisons on the ICDAR2003 and MSRA-TD500 datasets show that the method's overall performance is better than that of commonly used methods.
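The layer-by-layer rejection performed by the character selecting tree can be pictured as a cascade in which each stage discards some candidates and passes the rest on. The following is a minimal Python sketch of that idea only, not the paper's implementation: the abstract does not name the classifiers or features used, so the `Region` type, the three stage functions, and every threshold below are hypothetical hand-written rules standing in for trained classifiers.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Region:
    """A candidate region: bounding-box size and foreground pixel count."""
    width: int
    height: int
    area: int


# A stage returns True when the region may still be a character.
Stage = Callable[[Region], bool]


def size_ok(r: Region) -> bool:
    # Hypothetical rule: very short regions are treated as noise.
    return r.height >= 8


def aspect_ok(r: Region) -> bool:
    # Hypothetical rule: reject extremely elongated regions.
    ratio = r.width / r.height
    return 0.1 <= ratio <= 10.0


def fill_ok(r: Region) -> bool:
    # Hypothetical rule: reject nearly empty or completely solid boxes.
    fill = r.area / (r.width * r.height)
    return 0.2 <= fill <= 0.95


def cascade_filter(regions: Iterable[Region], stages: List[Stage]) -> List[Region]:
    """Apply the stages in order; only survivors reach the next stage."""
    survivors = list(regions)
    for stage in stages:
        survivors = [r for r in survivors if stage(r)]
    return survivors


candidates = [
    Region(20, 30, 300),    # plausible character
    Region(200, 5, 900),    # too short
    Region(300, 10, 1500),  # too elongated
    Region(20, 20, 400),    # completely solid box
]
kept = cascade_filter(candidates, [size_ok, aspect_ok, fill_ok])
```

Ordering the stages from cheapest to most expensive means later, costlier checks only run on the candidates that earlier stages could not reject, which is the usual motivation for a cascade of this kind.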
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2015, Issue S1, pp. 228-232 (5 pages)
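Funding
National Natural Science Foundation of China (61374038)
Natural Science Foundation of Jiangsu Province (BK20130639)
Priority Academic Program Development of Jiangsu Higher Education Institutions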
Keywords
text detection
extremal region
feature extraction
character grouping
random forest