The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects corresponding to the request. However, some active query subspaces may contain no query results at all; these are called false active query subspaces. The performance of query processing degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high-dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach that reduces such unnecessary accesses: a given query space can be refined by filtering within its mapped space. A mapping strategy called maxgap is proposed to improve the efficiency of this refinement. Based on the mapping strategy, an index structure called MS-tree and the corresponding query processing algorithms are presented. Finally, the performance of MS-tree is compared with that of other competitors in terms of range queries on a real data set.
Funding: Supported by the National Basic Research Program of China (Grant No. 2006CB303103), the National Natural Science Foundation of China (Grant Nos. 60873011, 60802026, 60773219, 60773021), and the High Technology Program (Grant No. 2007AA01Z192).
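The notion of a false active query subspace from the abstract can be made concrete with a small sketch. The code below is not the MS-tree or the maxgap mapping, whose details the abstract does not give. It assumes a plain bounding-box filter for a range query, and the names `min_dist_to_box` and `range_query` as well as the toy data are hypothetical, chosen only for illustration. A subspace counts as active when its bounding box intersects the query ball, and as false active when it is accessed but yields no result.

```python
import math

def min_dist_to_box(q, lo, hi):
    """Smallest Euclidean distance from point q to the axis-aligned box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

def range_query(q, r, subspaces):
    """Answer a range query of radius r around q.

    Each subspace is a list of points, summarized by its bounding box
    (an assumption of this sketch, not the paper's index structure).
    Returns the matching points and the number of false active subspaces.
    """
    results, false_active = [], 0
    for points in subspaces:
        lo = [min(c) for c in zip(*points)]
        hi = [max(c) for c in zip(*points)]
        if min_dist_to_box(q, lo, hi) > r:
            continue  # inactive subspace: pruned without accessing its points
        # Active subspace: its points must be accessed and checked one by one.
        hits = [p for p in points if math.dist(q, p) <= r]
        if not hits:
            false_active += 1  # accessed, but contributed nothing
        results.extend(hits)
    return results, false_active

subspaces = [
    [(0.0, 4.0), (4.0, 0.0)],    # box covers the query point, yet both points are far away
    [(1.0, 1.0), (1.5, 1.2)],    # truly active: both points fall inside the query ball
    [(9.0, 9.0), (10.0, 10.0)],  # inactive: the box lies outside the query ball
]
results, false_active = range_query((1.0, 1.0), 1.0, subspaces)
print(results, false_active)
```

In this toy data the first subspace is false active: its bounding box contains the query point, so it cannot be pruned, but neither of its points lies within the radius. A refinement step such as the mapped-space filtering proposed in the paper aims to avoid exactly this kind of wasted access.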