摘要
流数据处理和多维空间中子空间上Skyline的计算是近年来数据管理与数据挖掘领域的研究热点.此前相关工作只专注于滑动窗口上Skyline的维护问题,未涉及到滑动窗口中子空间Skyline的计算.文中提出了一个基于网格索引的高效维护滑动窗口上Skyline的算法,以此为基础采用自顶向下的方式通过两个阶段增量式地返回目标子空间上的结果;开发的多个剪枝策略和启发式优化方法显著地提高了全空间Skyline的维护以及子空间Skyline的计算效率.理论分析和实验结果表明:与同类算法相比,文中提出的StreamSubsky算法以极少的时间开销就能输出第一个结果,并且算法具有良好的可扩展性.
Data stream processing and Skyline computing within subspaces have recently received a lot of attentions in database and data mining community. Previous works only sought to maintain full space Skylines over sliding windows. No one considered the problem of computing subspace Skylines over sliding windows. In this paper, the authors adopt an efficient full space Skyline maintaining method leveraging grid index structure, and based on this, propose an efficient top-town algorithm to incrementally output the Skyline objects within specified subspaces. Moreover, the authors present a set of optimizing heuristics to speed up these processes. Theoretical analysis and extensive experiments demonstrate that the proposed algorithm can generate the first result only taking a little overhead compared with previous works, and the methods are both efficient and scalable.
出处
《计算机学报》
EI
CSCD
北大核心
2007年第8期1418-1428,共11页
Chinese Journal of Computers
基金
国家"九七三"重点基础研究发展规划项目基金(2005CB321905)资助~~