摘要
近年来,随着多媒体信息检索技术的不断发展,如何实现高维特征矢量的快速相似性查询成为一个重要的研究课题·为此,人们提出了许多索引结构,包括:R-Tree及其变种、对矢量进行量化近似的VA-File、引入量化思想的A-Tree等等·从公开发表的成果看,这些索引结构在较低维数时,都能够表现出较好的查询性能;而当维数增加时,性能则急剧恶化·为了在更高维数下实现快速相似查询,可采用VA-File和A-Tree中的近似思想,并借助Trie结构来组织和管理压缩后的近似矢量,即所谓的VA-Trie·实验结果表明,在高达128维时VA-Trie仍有查询加速,其性能远好于A-Tree·
Since 1990's, great progress has been made in the area of content-based multimedia information retrieval. A very challenging problem emerged at the same time: how to organize high dimensional vectors so that efficient similarity query could be realized. Many index structures have been proposed to solve this problem, such as R-Tree and its variants, VA-File, A-Tree etc. From the published results, it can be concluded that most of these methods could achieve good query performance when the dimensionality is less than 20. However, the performance suffers greatly as the dimensionality increases. To obtain efficient similarity query in higher dimensional spaces, a new index structure called VA-Trie is introduced. The key idea behind VA-Trie is adopting the idea of quantization to compress the vectors and then employing the Trie structure to organize and manage the approximations. The experimental results show that VA-Trie outperforms A-Tree and sequential scan in high dimensional spaces.
出处
《计算机研究与发展》
EI
CSCD
北大核心
2005年第12期2213-2218,共6页
Journal of Computer Research and Development
基金
国家自然科学基金项目(60402007
60373020)
国家"八六三"高技术研究发展计划基金项目(2002AA103011-5)
上海市科技发展基金项目(03DZ15019
03DZ14015)
教育部科学技术研究重点基金项目(104075)
关键词
索引结构
相似性查询
信息检索
index structure
similarity query
information retrieval