Abstract
The main challenge in processing high-speed data in a distributed environment is producing high-quality query results while consuming few network resources. This paper studies nearest neighbor queries in distributed environments and proposes a novel filter-based method that answers not only exact queries but also five kinds of approximate queries. The method installs a smart filter at each remote site to filter part of the incoming data, and continuously adjusts the range monitored by each filter to reduce the overall communication cost. Theoretical analysis and experiments on synthetic and real datasets indicate that the new method performs well.
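To make the filtering idea concrete, here is a minimal sketch of how a range filter at each remote site can cut traffic for an exact k-nearest-neighbor query. It is not the paper's algorithm, which the abstract does not specify. The names `SiteFilter`, `Coordinator`, and `run` are hypothetical. The sketch assumes an append-only stream (no expiring data), Euclidean distance, and a coordinator that pushes the current k-th nearest distance back to every site as the monitored radius.

```python
import heapq
import math


class SiteFilter:
    """Hypothetical site-side filter: forwards only points closer than `radius`."""

    def __init__(self, query, radius=math.inf):
        self.query = query
        self.radius = radius

    def admit(self, point):
        # Points at or beyond the radius cannot enter the current k-NN result.
        return math.dist(self.query, point) < self.radius


class Coordinator:
    """Keeps the k nearest points received so far."""

    def __init__(self, query, k):
        self.query = query
        self.k = k
        self._heap = []  # max-heap by distance, stored as (-distance, point)

    def receive(self, point):
        d = math.dist(self.query, point)
        heapq.heappush(self._heap, (-d, point))
        if len(self._heap) > self.k:
            heapq.heappop(self._heap)  # discard the farthest candidate
        return self.radius()

    def radius(self):
        # Until k points are known, every point must be forwarded.
        if len(self._heap) < self.k:
            return math.inf
        return -self._heap[0][0]

    def neighbors(self):
        points = [p for _, p in self._heap]
        return sorted(points, key=lambda p: math.dist(self.query, p))


def run(stream, query, k, n_sites):
    """Process (site_id, point) pairs; return the k-NN and the uplink message count."""
    coord = Coordinator(query, k)
    sites = [SiteFilter(query) for _ in range(n_sites)]
    messages = 0
    for site_id, point in stream:
        if sites[site_id].admit(point):
            messages += 1
            new_radius = coord.receive(point)
            # Assumption: the radius update reaches all sites immediately;
            # the cost of this broadcast is not counted in `messages`.
            for s in sites:
                s.radius = new_radius
    return coord.neighbors(), messages


stream = [(0, (5.0,)), (1, (1.0,)), (0, (9.0,)),
          (1, (2.0,)), (0, (7.0,)), (1, (0.5,))]
result, messages = run(stream, query=(0.0,), k=2, n_sites=2)
```

In this toy run only four of the six points are sent to the coordinator, because two points fall outside the radius in force when they arrive. A real system would also have to account for the downlink cost of radius updates and for data expiry, which this sketch ignores.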
Source
《计算机科学与探索》
CSCD
2007, No. 2, pp. 146-159 (14 pages)
Journal of Frontiers of Computer Science and Technology
Funding
Key Project of the National Natural Science Foundation of China under Grant Nos. 6049325 and 6049327.