Abstract
The conventional bag-of-visual-words (BoVW) model ignores the spatial location of visual words. To address this, the paper proposes an image categorization approach based on a visual words co-occurrence matrix. First, the whole image is partitioned with a spatial pyramid to obtain a sequence of image regions. Then, for each SIFT point in an image region, a Visual Words Co-occurrence Matrix (VWCM) unit is constructed over the spatial neighborhood of that point, and the units are combined into the VWCM of the region. Finally, a new matching kernel, the Spatial Pyramid Co-occurrence Matrix Kernel (SPCMK), is designed and used for image categorization. Because it captures both the absolute and the relative spatial location of visual words, the approach gives a more complete and more accurate image representation. Experimental results indicate that the proposed approach substantially improves classification accuracy.
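The three steps in the abstract can be sketched as follows. This is only an illustrative sketch, not the paper's implementation: the abstract does not give the exact definitions of the VWCM unit or of SPCMK, so the code assumes a circular neighborhood of a hypothetical `radius`, restricts neighbors to points in the same pyramid cell, and substitutes a histogram-intersection kernel with the usual spatial pyramid matching level weights for the actual SPCMK. The function names and parameters are all hypothetical, and keypoints are assumed to be already quantized into visual word indices.

```python
import numpy as np

def cooccurrence_matrix(points, words, vocab_size, radius):
    """Count ordered pairs of visual words whose keypoints lie within `radius`.

    points: (N, 2) float array of keypoint (x, y) positions.
    words:  (N,) int array of visual word indices in [0, vocab_size).
    """
    M = np.zeros((vocab_size, vocab_size), dtype=np.float64)
    idx = np.arange(len(points))
    for i in range(len(points)):
        dist = np.linalg.norm(points - points[i], axis=1)
        neighbors = np.flatnonzero((dist <= radius) & (idx != i))
        # np.add.at accumulates correctly when a word index repeats.
        np.add.at(M, (words[i], words[neighbors]), 1.0)
    return M

def pyramid_cooccurrence(points, words, vocab_size, radius,
                         width, height, levels=2):
    """Build one co-occurrence matrix per cell of a 2^l x 2^l grid, l = 0..levels.

    Assumption: neighbors are searched only inside the same cell.
    """
    pyramid = []
    for level in range(levels + 1):
        n = 2 ** level
        col = np.minimum((points[:, 0] / width * n).astype(int), n - 1)
        row = np.minimum((points[:, 1] / height * n).astype(int), n - 1)
        cells = []
        for r in range(n):
            for c in range(n):
                mask = (row == r) & (col == c)
                cells.append(cooccurrence_matrix(
                    points[mask], words[mask], vocab_size, radius))
        pyramid.append(cells)
    return pyramid

def pyramid_kernel(pyr_a, pyr_b):
    """Stand-in for SPCMK: weighted intersection of co-occurrence matrices."""
    levels = len(pyr_a) - 1
    total = 0.0
    for level, (cells_a, cells_b) in enumerate(zip(pyr_a, pyr_b)):
        # Finer levels receive larger weights, as in spatial pyramid matching.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        total += weight * sum(np.minimum(a, b).sum()
                              for a, b in zip(cells_a, cells_b))
    return total

# Toy usage: three keypoints, a two-word vocabulary, a 10 x 10 image.
pts = np.array([[1.0, 1.0], [2.0, 1.0], [9.0, 9.0]])
wds = np.array([0, 1, 0])
pyr = pyramid_cooccurrence(pts, wds, vocab_size=2, radius=2.0,
                           width=10, height=10, levels=1)
similarity = pyramid_kernel(pyr, pyr)
```

In this sketch the grid cell records where a word pair occurs (absolute position), while the matrix entries record which words appear near each other (relative position). A kernel matrix computed this way over all image pairs could be passed to any classifier that accepts precomputed kernels.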
Source
Journal of Information Engineering University (《信息工程大学学报》)
2013, No. 4, pp. 439-446 (8 pages)
Funding
National Natural Science Foundation of China (Grant No. 60872142)
Keywords
image categorization
visual words
spatial pyramid
visual words co-occurrence matrix
spatial pyramid co-occurrence matrix kernel