LDA Model Combined with Spatial Information for Visual Object Recognition
Abstract In recent years, many researchers have introduced the latent Dirichlet allocation (LDA) model, which is widely used in natural language processing, into applications such as visual object recognition, object segmentation, and scene classification. Because LDA is a generative model, it shares a limitation common to such models: it assumes that the latent topic assignments of different visual words are conditionally independent. Images, however, carry spatial information that plays an important role in object recognition, and the topic generated for a visual word is influenced by the topics of its adjacent visual words. To improve the accuracy of topic assignment for visual words, the paper proposes an LDA model combined with spatial information, namely an LDA model fused with a conditional random field (CRF). The model incorporates 2D image spatial information into the local latent topic labels, which avoids losing spatial information and improves the accuracy of topic assignment. The main work is as follows: the LDA model is first modified, a CRF is introduced into it, and the model parameters are derived with the corresponding expectation-maximization (EM) algorithm. By using the CRF to capture the 2D spatial information of an image, the paper combines a generative model with a discriminative model. This strengthens the spatial correlation between the topic labels of adjacent regions, which is determined by the nature of the image itself, improves the accuracy of visual object recognition, and accomplishes automatic image annotation.
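The idea that a visual word's topic should depend on the topics of its neighbours can be illustrated with a small sketch. The code below is not the paper's model or its EM derivation. It is a toy stand-in that assumes each image patch already has per-topic log-scores (as an independent LDA-style assignment might produce) and then couples neighbouring patches with a Potts-style pairwise bonus, optimised by iterated conditional modes (ICM). The function name `icm_smooth`, the weight `beta`, the 4-neighbourhood, and the 3x3 example grid are all assumptions made for illustration.

```python
import math


def icm_smooth(unary, beta=1.0, sweeps=5):
    """Toy spatial smoothing of topic labels (illustrative assumption only).

    unary[r][c][k] is the log-score of topic k for the patch at (r, c).
    beta is the bonus for each 4-neighbour that carries the same topic.
    Returns a grid of topic labels after iterated conditional modes.
    """
    rows, cols = len(unary), len(unary[0])
    n_topics = len(unary[0][0])

    # Independent start: every patch picks its own best topic, which mirrors
    # the conditional-independence assumption of plain LDA.
    labels = [[max(range(n_topics), key=lambda k: unary[r][c][k])
               for c in range(cols)] for r in range(rows)]

    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Current labels of the 4-connected neighbours inside the grid.
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]

                def score(k):
                    # Unary evidence plus a bonus per agreeing neighbour.
                    agree = sum(1 for n in neighbours if n == k)
                    return unary[r][c][k] + beta * agree

                best = max(range(n_topics), key=score)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical 3x3 grid with two topics: every patch clearly prefers topic 0,
# except the centre patch, which weakly prefers topic 1.
strong = [math.log(0.9), math.log(0.1)]
weak = [math.log(0.45), math.log(0.55)]
unary = [[strong, strong, strong],
         [strong, weak, strong],
         [strong, strong, strong]]

independent = icm_smooth(unary, beta=0.0)  # no spatial coupling
coupled = icm_smooth(unary, beta=1.0)      # neighbours influence each patch
```

With `beta=0.0` the centre patch keeps its own weak preference, while with `beta=1.0` the agreement of its four neighbours outweighs it. A CRF-coupled topic model relies on the same qualitative effect, although the paper estimates its parameters with EM instead of fixing a weight by hand.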
Source Intelligent Computer and Applications (《智能计算机与应用》), 2013, No. 4, pp. 29-33, 38 (6 pages)
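Funding National Natural Science Foundation of China (61171185, 61271346, 60932008); Specialized Research Fund for the Doctoral Program of Higher Education (20112302110040)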
Keywords Visual Object Recognition; LDA Model; Spatial Information; Conditional Random Field (CRF); Expectation-Maximization (EM) Algorithm