Abstract
Large volumes of content tagged with location and time, such as microblog posts, news, and group-buying deals, are generated on the Web every day, so it is important to find, among all this content, the items that satisfy a user's query in both time and space. To meet this need, this paper proposes a processing method for k-nearest-neighbor queries over location and time information (ST-kNN queries). First, the location and time variables of each data object are transformed according to spatio-temporal similarity, mapping the object into a new three-dimensional space, where the distance between two points is used to approximate the actual spatio-temporal similarity between the two objects. Second, an index called ST-Rtree (spatial temporal rtree) is designed for this three-dimensional space; it combines spatial and temporal factors and ensures that each object is visited at most once during a query. Finally, an exact kNN query algorithm is built on this index; it determines the result range with a single computation and then retrieves the top-k results, which keeps query processing efficient. Experiments on large datasets demonstrate the efficiency of the proposed approach.
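The mapping-then-search idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual transformation or the ST-Rtree structure, so everything here is an assumption made for illustration: the weight `alpha`, the normalizers `d_max` and `t_max`, and the function names `to_3d` and `st_knn` are hypothetical, and a brute-force scan stands in for the index.

```python
import heapq
import math
from typing import Iterable, List, Tuple

# (object id, x, y, t): hypothetical record layout, not taken from the paper.
Record = Tuple[str, float, float, float]


def to_3d(x: float, y: float, t: float,
          alpha: float = 0.5, d_max: float = 1.0, t_max: float = 1.0
          ) -> Tuple[float, float, float]:
    """Scale location and time so that squared Euclidean distance in 3D equals
    alpha * (spatial distance / d_max)^2 + (1 - alpha) * (time gap / t_max)^2.
    This weighting is an assumed stand-in for the paper's transformation."""
    s = math.sqrt(alpha) / d_max
    u = math.sqrt(1.0 - alpha) / t_max
    return (x * s, y * s, t * u)


def st_knn(objects: Iterable[Record], query: Tuple[float, float, float], k: int,
           alpha: float = 0.5, d_max: float = 1.0, t_max: float = 1.0) -> List[str]:
    """Return the ids of the k objects closest to the query in the mapped space.
    Each object is scored exactly once; a linear scan replaces the ST-Rtree."""
    q = to_3d(*query, alpha, d_max, t_max)
    scored = (
        (math.dist(to_3d(x, y, t, alpha, d_max, t_max), q), oid)
        for oid, x, y, t in objects
    )
    return [oid for _, oid in heapq.nsmallest(k, scored)]


data = [("a", 0, 0, 0), ("b", 1, 0, 0), ("c", 0, 0, 10), ("d", 5, 5, 5)]
balanced = st_knn(data, (0, 0, 0), 3, alpha=0.5, d_max=10, t_max=10)
space_heavy = st_knn(data, (0, 0, 0), 3, alpha=0.99, d_max=10, t_max=10)
```

With these sample points, raising `alpha` makes spatial closeness dominate, so an object at the same place but far away in time can outrank one that is moderately distant in both. A real implementation would replace the scan with a spatial index over the mapped points.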
Source
《软件学报》
EI
CSCD
PKU Core Journal
2016, Issue 9, pp. 2278-2289 (12 pages)
Journal of Software
Funding
National Natural Science Foundation of China (61472070)
National Basic Research Program of China (973 Program) (2012CB316201)
Keywords
location
time
temporal & spatial similarity
index
k nearest neighbor query