Fast searches and query operations in high dimensional databases require efficient index structures. Amonga variety of index structures, the index structures in metric spaces are very useful. They can be used in an ex...Fast searches and query operations in high dimensional databases require efficient index structures. Amonga variety of index structures, the index structures in metric spaces are very useful. They can be used in an extensivefield, such as searching for protein molecular chains with certain sequences in Computational Biology and matching agiven strings fuzzily in Text Retrieval. In this paper, the features of index structures in metric spaces are analyzedand subsequently a further classification is given to these index structures. Finally, some representative index struc-tures are introduced in detail.展开更多
Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector presentation and one-dimensional (1D) transformation can break the curs...Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector presentation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on the two techniques above, a novel high-dimensional index is proposed, called Bit-code and Distance based index (BD). BD is based on a special partitioning strategy which is optimized for high-dimensional data. By the definitions of bit code and transformation function, a high-dimensional vector can be first approximately represented and then transformed into a 1D vector, the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrated that BD out- performs the existing index structures for KNN search in high-dimensional spaces.展开更多
文摘Fast searches and query operations in high dimensional databases require efficient index structures. Amonga variety of index structures, the index structures in metric spaces are very useful. They can be used in an extensivefield, such as searching for protein molecular chains with certain sequences in Computational Biology and matching agiven strings fuzzily in Text Retrieval. In this paper, the features of index structures in metric spaces are analyzedand subsequently a further classification is given to these index structures. Finally, some representative index struc-tures are introduced in detail.
基金Project (No. [2005]555) supported by the Hi-Tech Research and De-velopment Program (863) of China
文摘Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector presentation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on the two techniques above, a novel high-dimensional index is proposed, called Bit-code and Distance based index (BD). BD is based on a special partitioning strategy which is optimized for high-dimensional data. By the definitions of bit code and transformation function, a high-dimensional vector can be first approximately represented and then transformed into a 1D vector, the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrated that BD out- performs the existing index structures for KNN search in high-dimensional spaces.