摘要
本文提出一种新的高维空间中点数据的索引方法 ,其基本原理是用格矢量量化 (Latticevectorquantiza tion)均匀划分数据空间、用倒排文件 (InvertedFile)存储格点、用Trie树实现倒排文件的组织和存储、用Trie并行搜索算法实现倒排文件的快速访问 .和传统索引方法相比 ,新方法具有许多优点 ,例如它能以较低的复杂度建立索引结构、支持非常高维的数据索引、充分利用高维空间中点分布的稀疏性等 .实验结果表明 ,在较高维数时 ,LIFT性能优于传统索引方法 .
A new method for indexing large amounts of points in high-dimensional space is proposed. The basic principle is as follows: uniformly partitioning the data space by lattice vector quantization, storing the lattice points by inverted file, organizing the inverted file by Trie tree, and fast accessing the inverted file by Trie parallel search algorithm. We call this index structure LIFT. Compared with the traditional index methods, the LIFT can build the index structure with low complexity, support very high dimensionality, and take advantage of sparsity of data points in high-dimensional space, etc. The experiments show that for high-dimensional data, the LIFT outperforms the well-known R-tree.
出处
《电子学报》
EI
CAS
CSCD
北大核心
2001年第2期192-195,共4页
Acta Electronica Sinica
基金
自然科学基金重点项目! (No.699350 1 0 )
国家863-317-01-07-99
自然科学基金项目! (No.60 0 0 30 1 7)
关键词
索引结构
相似性检索
高维数据索引
程序设计
Computer simulation
Data processing
Information retrieval
Parallel algorithms
Trees (mathematics)
Vector quantization