We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing and ...We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. Then, all these CCs are fed into a cascade of classifiers trained by Adaboost algorithm. Each classifier in the cascade responds to one feature of the CC. Proposed here are 12 novel features which are insensitive to noise, scale, text orientation and text language. The classifier cascade allows non-text CCs of the image to be rapidly discarded while more computation is spent on promising text-like CCs. The CCs passing through the cascade are considered as text components and are used to form the segmentation result. A prototype system was built, with experimental results proving the effectiveness and efficiency of the proposed method.展开更多
This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorit...This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorithm. Then all the CCs including text CCs and non-text CCs are verified on their text features by a 2-stage classification module, where most non-text CCs are discarded by an attentional cascade classifier and remaining CCs are further verified by an SVM. All the accepted CCs are output to result in text only binary image. Experiments with many images in different scenes showed satisfactory performance of our proposed method.展开更多
文摘We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. Then, all these CCs are fed into a cascade of classifiers trained by Adaboost algorithm. Each classifier in the cascade responds to one feature of the CC. Proposed here are 12 novel features which are insensitive to noise, scale, text orientation and text language. The classifier cascade allows non-text CCs of the image to be rapidly discarded while more computation is spent on promising text-like CCs. The CCs passing through the cascade are considered as text components and are used to form the segmentation result. A prototype system was built, with experimental results proving the effectiveness and efficiency of the proposed method.
基金Project supported by the OMRON and SJTU Collaborative Founda-tion under PVS project (2005.03~2005.10)
文摘This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorithm. Then all the CCs including text CCs and non-text CCs are verified on their text features by a 2-stage classification module, where most non-text CCs are discarded by an attentional cascade classifier and remaining CCs are further verified by an SVM. All the accepted CCs are output to result in text only binary image. Experiments with many images in different scenes showed satisfactory performance of our proposed method.