
Recognition of Text in Video Based on Color Clustering and Multiple Frame Integration

Cited by: 22
Abstract: This paper proposes a method for recognizing text in video based on color clustering and multiple frame integration. First, in the text detection module, two salient features of text regions are considered jointly: homogeneous color and dense edges. Affinity propagation clustering is used to decompose the color edge map adaptively into several edge sub-maps according to the complexity of the edge colors in the image, which makes text detection within each sub-map more accurate. Second, in the text enhancement module, blurred text regions are filtered out using a text stroke intensity map, and text regions that are detected in different frames and contain the same content are integrated by combining the advantages of average integration and minimum integration, yielding text region images with smoother backgrounds and clearer strokes. Finally, in the text extraction module, binarization is performed on an adaptively selected color component with high text contrast, rather than on a fixed color plane as in existing methods, which gives better binarization results. In addition, noise is removed by color clustering based on the color difference between background and text, which effectively improves the text recognition rate. Experimental results show that the method achieves better text recognition results than existing methods.
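The abstract names average integration and minimum integration but does not say how the two are combined. The following is a minimal sketch of the two per-pixel operations on aligned grayscale text regions, together with one hypothetical way to blend them. The function names, the `alpha` weight, and the linear blend itself are illustrative assumptions and are not taken from the paper; the sketch also assumes the regions have already been matched and cropped to the same size.

```python
from typing import List

# A grayscale text-region image: rows of pixel intensities.
Frame = List[List[float]]


def _check(frames: List[Frame]) -> None:
    """Reject an empty frame list so the fusions below are well defined."""
    if not frames:
        raise ValueError("at least one frame is required")


def average_fusion(frames: List[Frame]) -> Frame:
    """Per-pixel mean across frames (tends to smooth a changing background)."""
    _check(frames)
    n = len(frames)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*frames)]


def minimum_fusion(frames: List[Frame]) -> Frame:
    """Per-pixel minimum across frames.

    Assumption: text is bright and stable, so the minimum keeps the
    strokes while darkening background pixels that vary over time.
    """
    _check(frames)
    return [[min(px) for px in zip(*rows)] for rows in zip(*frames)]


def blended_fusion(frames: List[Frame], alpha: float = 0.5) -> Frame:
    """Hypothetical combination: alpha * average + (1 - alpha) * minimum.

    This linear blend is an illustrative assumption, not the paper's rule.
    """
    avg = average_fusion(frames)
    mn = minimum_fusion(frames)
    return [
        [alpha * a + (1.0 - alpha) * m for a, m in zip(avg_row, min_row)]
        for avg_row, min_row in zip(avg, mn)
    ]


if __name__ == "__main__":
    # Two 2x2 regions: left column is stable "text", right column is background.
    frames = [
        [[200, 10], [200, 90]],
        [[200, 50], [200, 30]],
    ]
    print(average_fusion(frames))
    print(minimum_fusion(frames))
    print(blended_fusion(frames, alpha=0.5))
```

In this toy input the stable left column is unchanged by every fusion, while the varying right column is smoothed by the average and pushed toward its darkest value by the minimum, which is the trade-off the abstract alludes to.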
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 12, pp. 2919-2933 (15 pages)
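Funding: National Natural Science Foundation of China (60873154, 61073084); project funded by the National Development and Reform Commission ([2010]3044)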
Keywords: text recognition of video; color-based clustering; multiple frame integration; video retrieval; noise removal