Abstract
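This paper proposes a video text recognition method based on color clustering and multi-frame fusion. First, in the video text detection module, two salient features of text regions are considered jointly: consistent color and dense edges. The affinity propagation clustering algorithm is used to adaptively decompose the color edges into several edge sub-maps according to the complexity of the edge colors in the image, so that text regions can be detected more accurately in each sub-map. Second, in the video text enhancement module, blurred text regions are filtered out using a text stroke intensity map, and the advantages of average fusion and minimum fusion are combined to fuse text regions that are detected in different video frames and contain the same content, yielding text region images with a smoother background and clearer strokes. Finally, in the video text extraction module, binarization is performed on an adaptively selected color component with high text contrast, which gives better binarization results than existing methods; in addition, noise is removed by color clustering based on the color difference between background and text, which effectively improves the text recognition rate. Experimental results show that the method achieves better text recognition results than existing methods.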
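The abstract does not say how average fusion and minimum fusion are combined, so the following is only a minimal sketch of the two operations and one possible blend. It assumes the text regions are already aligned, equally sized grayscale blocks with text brighter than the background; the function names and the `alpha` weight are hypothetical and are not taken from the paper.

```python
def average_fusion(blocks):
    """Pixel-wise mean over aligned blocks; tends to smooth a changing background."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(b[r][c] for b in blocks) / len(blocks) for c in range(cols)]
            for r in range(rows)]


def minimum_fusion(blocks):
    """Pixel-wise minimum; static bright text survives, varying background darkens."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[min(b[r][c] for b in blocks) for c in range(cols)]
            for r in range(rows)]


def blended_fusion(blocks, alpha=0.5):
    """Hypothetical combination: alpha * average + (1 - alpha) * minimum.

    This weighting is an assumption for illustration, not the paper's rule.
    """
    avg = average_fusion(blocks)
    mn = minimum_fusion(blocks)
    return [[alpha * a + (1 - alpha) * m for a, m in zip(avg_row, min_row)]
            for avg_row, min_row in zip(avg, mn)]


# Two 2x2 blocks of the same text region from different frames:
# column 0 is (static, bright) text, column 1 is (changing) background.
frames = [
    [[200, 50], [200, 90]],
    [[200, 10], [200, 30]],
]
fused = blended_fusion(frames, alpha=0.5)
```

In this toy input the text column keeps its intensity under both operations, while the background column is pulled toward its darkest observed value, which is the contrast gain that motivates fusing across frames.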
This paper proposes a new approach to video text recognition whose novelty lies mainly in color-based clustering and multiple frame integration across three phases. First, in the text detection phase, two significant features of text blocks in video are considered jointly: homogeneous color and dense edges. Color-based clustering is employed to decompose the color edge map of a video frame into several edge maps, which makes text detection more accurate. Second, in the text enhancement phase, blurred text is filtered out using the proposed text-intensity map, and text blocks with the same content are identified and integrated, yielding a clean background and clear, high-contrast text for effective text extraction. Third, in the text extraction phase, instead of binarizing a text block in a fixed color plane as existing methods do, the approach adaptively selects the best color plane for binarization according to the difference in text contrast among color planes. In addition, for effective text recognition, the color differences between text and background in video frames are considered, and color-based clustering is used to remove noise. Extensive experimental results show that this approach outperforms several existing state-of-the-art methods.
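The idea of adaptively choosing a color plane before binarization can be sketched as follows. The abstract does not define its text contrast measure, so this example substitutes Otsu's between-class variance as a stand-in score: the plane whose best threshold separates two classes most strongly is selected. The helper names and the flat list of `(r, g, b)` tuples are assumptions made for illustration.

```python
def otsu(values):
    """Return (threshold, between-class variance) for 8-bit integer values."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, 0.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]          # pixels with value <= t
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0        # pixels with value > t
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2 / total ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t, best_var


def select_plane(pixels):
    """Pick the channel index with the largest Otsu variance (assumed contrast proxy)."""
    scores = [otsu([p[ch] for p in pixels]) for ch in range(3)]
    channel = max(range(3), key=lambda ch: scores[ch][1])
    return channel, scores[channel][0]


def binarize(pixels, channel, threshold):
    """Mark pixels above the threshold in the chosen channel as 1, others as 0."""
    return [1 if p[channel] > threshold else 0 for p in pixels]


# Toy block: only the first channel separates the two pixel groups clearly.
block = [(10, 100, 100), (10, 110, 100), (250, 100, 100), (250, 110, 100)]
channel, threshold = select_plane(block)
mask = binarize(block, channel, threshold)
```

A fixed-plane method would binarize the same block in, say, a luminance plane regardless of where the contrast lies; scoring each plane first is what makes the selection adaptive.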
Source
《软件学报》
EI
CSCD
PKU Core Journals (北大核心)
2011, Issue 12, pp. 2919-2933 (15 pages)
Journal of Software
Funding
National Natural Science Foundation of China (60873154, 61073084)
National Development and Reform Commission funded project ([2010]3044)
Keywords
video text recognition
color-based clustering
multiple frame integration
video retrieval
noise removal