This paper proposes a new two-phase approach to robust text detection by integrating the visual appearance and the geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rat...This paper proposes a new two-phase approach to robust text detection by integrating the visual appearance and the geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform(RSWT) feature is proposed to better recover the stroke width by additionally considering the cross of two strokes and the continuousness of the letter border. In the second phase, a classification scheme based on visual appearance features is used to reject the false alarms while keeping the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning(DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective for multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over the state-of-the-art approaches by a performance gain up to 4.4% in terms of F-measure.展开更多
基金supported by National Natural Science Foundation of China(Nos.61300163,61125106 and 61300162)Jiangsu Key Laboratory of Big Data Analysis Technology
文摘This paper proposes a new two-phase approach to robust text detection by integrating the visual appearance and the geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform(RSWT) feature is proposed to better recover the stroke width by additionally considering the cross of two strokes and the continuousness of the letter border. In the second phase, a classification scheme based on visual appearance features is used to reject the false alarms while keeping the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning(DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective for multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over the state-of-the-art approaches by a performance gain up to 4.4% in terms of F-measure.