Abstract
Encoding local features as visual words is currently a very popular approach in image analysis. Building on ordinary visual words, we propose a novel kernel that fuses the multi-level contexts of words. The kernel incorporates three kinds of information: 1) the multi-level histogram pyramid of words; 2) the multi-level histogram pyramid of visual phrases (local co-occurrence patterns of words); and 3) the context classes of those words and phrases. Support vector machines using this kernel are then trained to perform image classification. The method performs well on public test sets such as the Corel image database. It is also evaluated on a challenging practical problem, discriminating pornographic images from bikini images, where its classification accuracy is about 7% higher than that of the baseline method. The experimental results show that combining kernel-based measurements with a multi-level representation of visual words can significantly improve image classification performance. Future work should investigate more compact and efficient representations of contexts.
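The abstract does not give the kernel's formula, so the following is only a minimal sketch of the general idea, not the authors' kernel. It assumes that each image is already a 2D grid of visual word IDs, that "multi-level" means spatial grids of 2^level x 2^level cells, that a "phrase" is a word paired with its right-hand neighbour, and that histograms are compared by histogram intersection. The function names, the level weights, and `phrase_weight` are all hypothetical, and the context classes mentioned in the abstract are left out.

```python
from collections import Counter


def cell_histograms(grid, level, phrases=False):
    """Return one histogram per cell of a 2^level x 2^level partition of the grid."""
    rows, cols = len(grid), len(grid[0])
    n = 2 ** level
    hists = [Counter() for _ in range(n * n)]
    for r in range(rows):
        for c in range(cols):
            cell = (r * n // rows) * n + (c * n // cols)
            if phrases:
                # Assumed "phrase": a word together with its right-hand neighbour.
                if c + 1 < cols:
                    hists[cell][(grid[r][c], grid[r][c + 1])] += 1
            else:
                hists[cell][grid[r][c]] += 1
    return hists


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def pyramid_kernel(grid_a, grid_b, levels=2, phrase_weight=0.5):
    """Weighted sum of word and phrase histogram intersections over all levels."""
    total = 0.0
    for level in range(levels + 1):
        # Assumed weighting: finer levels count more than coarser ones.
        level_weight = 1.0 / 2 ** (levels - level)
        for use_phrases, channel_weight in ((False, 1.0), (True, phrase_weight)):
            hists_a = cell_histograms(grid_a, level, use_phrases)
            hists_b = cell_histograms(grid_b, level, use_phrases)
            matched = sum(intersection(x, y) for x, y in zip(hists_a, hists_b))
            total += level_weight * channel_weight * matched
    return total
```

Evaluating `pyramid_kernel` on every pair of training images would give a Gram matrix, which could then be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.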
Source
Journal of Image and Graphics (《中国图象图形学报》)
CSCD
PKU Core Journal (北大核心)
2010, No. 4, pp. 607-616 (10 pages)
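Funding
National High Technology Research and Development Program of China (863 Program), Grant No. 2003AA142140
National Natural Science Foundation of China, Grant No. 60702035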
Keywords
image classification, kernel, support vector machine, visual word, multi-resolution histogram