Abstract: Many recent state-of-the-art image retrieval approaches are based on the Bag-of-Visual-Words model and represent an image with a set of visual words by quantizing local SIFT (scale-invariant feature transform) features. Feature quantization reduces the discriminative power of local features and unavoidably causes many false local matches between images, which degrades retrieval accuracy. To filter out those false matches, the geometric context among visual words has been widely explored to verify geometric consistency. However, existing studies with global or local geometric verification are either computationally expensive or achieve only limited accuracy. To address this issue, this paper focuses on partial-duplicate Web image retrieval and proposes a scheme to encode the spatial context for visual matching verification. An efficient affine enhancement scheme is further proposed to refine the verification results. Experiments on partial-duplicate Web image search, using a database of one million images, demonstrate the effectiveness and efficiency of the proposed approach. Evaluation on a 10-million-image database further reveals the scalability of our approach.
Funding: Supported in part to Dr. Wen-Gang Zhou by the Fundamental Research Funds for the Central Universities of China under Grant Nos. WK2100060014 and WK2100060011, the Start-Up Funding from the University of Science and Technology of China under Grant No. KY2100000036, the Open Project of Beijing Multimedia and Intelligent Software Key Laboratory at Beijing University of Technology, and the Intel ICRI MNC project; in part to Dr. Hou-Qiang Li by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61325009, 61390514, and 61272316; in part to Dr. Yijuan Lu by the Army Research Office (ARO) of USA under Grant No. W911NF-12-1-0057 and the National Science Foundation of USA under Grant No. CRI 1305302; and in part to Dr. Qi Tian by ARO under Grant No. W911NF-12-1-0057 and the Faculty Research Award by NEC Laboratories of America, respectively. This work was also supported in part by NSFC under Grant No. 61128007.
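For orientation, below is a minimal NumPy sketch of the Bag-of-Visual-Words pipeline the abstract refers to: quantizing local descriptors against a codebook, collecting tentative matches by shared visual word, and pruning them with a simple orientation-consistency vote as a weak geometric check. This is illustrative only; the function names, the toy random codebook, and the orientation vote are this sketch's assumptions, and it does not implement the paper's spatial-context encoding or its affine enhancement scheme.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each local descriptor (e.g., a 128-D SIFT vector) to its
    nearest codebook centroid, producing one visual-word ID per feature."""
    # Squared Euclidean distances between all descriptors and centroids.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def word_matches(words_a, words_b):
    """Tentative matches: index pairs of features quantized to the same
    visual word. Quantization makes many of these false matches."""
    return [(i, j) for i, wa in enumerate(words_a)
                   for j, wb in enumerate(words_b) if wa == wb]

def orientation_filter(matches, angles_a, angles_b, tol=np.pi / 12):
    """Weak geometric-consistency check: keep matches whose keypoint
    orientation difference agrees with the dominant rotation between
    the two images (a stand-in for full spatial verification)."""
    diffs = np.array([angles_b[j] - angles_a[i] for i, j in matches])
    diffs = np.mod(diffs + np.pi, 2 * np.pi) - np.pi  # wrap to [-pi, pi)
    hist, edges = np.histogram(diffs, bins=24, range=(-np.pi, np.pi))
    peak = 0.5 * (edges[hist.argmax()] + edges[hist.argmax() + 1])
    dev = np.abs(np.mod(diffs - peak + np.pi, 2 * np.pi) - np.pi)
    return [m for m, d in zip(matches, dev) if d < tol]

# Toy usage with random data (a real system would train the codebook
# with k-means on a large corpus of SIFT descriptors).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(50, 128))            # 50 visual words
desc_a, desc_b = rng.normal(size=(40, 128)), rng.normal(size=(60, 128))
ang_a = rng.uniform(-np.pi, np.pi, 40)
ang_b = rng.uniform(-np.pi, np.pi, 60)
tentative = word_matches(quantize(desc_a, codebook),
                         quantize(desc_b, codebook))
verified = orientation_filter(tentative, ang_a, ang_b)
print(len(tentative), "tentative ->", len(verified), "verified matches")
```

The orientation vote illustrates the trade-off the abstract highlights: it is far cheaper than a full global geometric model such as RANSAC, but weaker, which is the gap that richer spatial-context encoding aims to close.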