
A collaborative image recognition method based on semantic level of text (cited by: 7)
Abstract: Image recognition that relies only on low-level visual features has low accuracy. Because many images contain embedded text, this paper proposes a semantic-level text-collaborative image recognition method that uses that text to assist recognition. The method has three steps. First, text blocks in the image are located with a text localization method, then segmented, binarized, and passed through feature extraction to obtain the semantics of the text. Second, low-level visual features of the image are extracted and the correlation between the visual and textual modalities is computed, which yields the collaborative posterior probability. Finally, the joint posterior probability of each image class is computed from the previous two results, and the image is assigned to the class with the maximum joint posterior probability. In experiments on a self-built data set of sports video frames, the proposed method achieved higher accuracy than a single-modal method, represented by naive Bayes, with each of three different visual features. The results indicate that embedded text can effectively assist image recognition.
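The final step described in the abstract, choosing the class with the maximum joint posterior probability, can be illustrated with a small sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the fusion rule (a normalized product of per-class posteriors, which corresponds to conditional independence of the two modalities with a uniform class prior), the function names `normalize`, `joint_posterior`, and `recognize`, and the example class labels and probabilities.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {label: value / total for label, value in scores.items()}


def joint_posterior(visual: Dict[str, float], text: Dict[str, float]) -> Dict[str, float]:
    """Fuse two per-class posteriors by a normalized product.

    Assumption (not from the paper): the modalities are conditionally
    independent given the class and the class prior is uniform.
    """
    labels = visual.keys() & text.keys()
    fused = {label: visual[label] * text[label] for label in labels}
    return normalize(fused)


def recognize(visual: Dict[str, float], text: Dict[str, float]) -> str:
    """Return the class with the maximum joint posterior probability."""
    joint = joint_posterior(visual, text)
    return max(joint, key=joint.get)


# Hypothetical posteriors for one sports video frame.
visual_posterior = {"basketball": 0.40, "football": 0.45, "tennis": 0.15}
text_posterior = {"basketball": 0.70, "football": 0.20, "tennis": 0.10}

print(recognize(visual_posterior, text_posterior))
```

In this made-up example the visual posterior alone slightly favors "football", while the text-derived posterior moves the fused decision to "basketball", which is the kind of correction the abstract attributes to the embedded text.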
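Source: Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2014, No. 3, pp. 49-53 (5 pages)
Funding: National Natural Science Foundation of China (61173087, 41071262)
Keywords: text localization; image recognition; multi-modal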


