
A Method of Stream Cube Computing Based on Interesting View Subset (cited by: 2)
Abstract: Stream cube computation is an important foundation for the multidimensional analysis of data streams. However, the dynamic, unbounded, and bursty nature of data streams, together with the complexity of multidimensional data structures, poses major challenges in storage space, update efficiency, and adaptability. In many applications, users' interest concentrates on only a portion of the views. Based on this observation, this paper proposes a computing method based on an interesting view subset. The interesting view subset and the interesting paths are determined from historical query information, and they are updated over time if query-answering efficiency decreases. A Stream-Tree structure is defined to materialize, in main memory, the cells of the interesting view subset and the drilling paths. At run time, the cells of the Stream-Tree are continuously updated as new tuples arrive, and old cells are deleted periodically according to the constraints of multi-level time windows. Sparse cells are not divided into finer ones; only their high-level aggregates are preserved. Experiments and analysis indicate that the method can maintain the stream cube cells of the current time window within limited main memory, while supporting fast updates of the storage structure and fast responses to user queries.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core list; 2011, No. 12, pp. 2369-2378 (10 pages)
Funding: National Natural Science Foundation of China (70771110)
Keywords: data stream; stream cube; multidimensional analysis; interesting view subset; multi-level time windows
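
The abstract describes the approach only at a high level, so the following Python sketch is an illustration under stated assumptions, not the paper's Stream-Tree definition. It assumes that views are tuples of dimension names, that the interesting views are simply the most frequently queried ones, that the tree has one level per dimension along a single interesting path, and that one sliding window of fixed slots stands in for the multi-level time windows. The names `select_interesting_views`, `StreamTree`, `window_slots`, and `min_support` are hypothetical.

```python
from collections import Counter, defaultdict


def select_interesting_views(query_log, k):
    """Return the k most frequently queried views (assumed selection rule)."""
    return [view for view, _ in Counter(query_log).most_common(k)]


class Node:
    """One cell on the drilling path; aggregates are kept per time slot."""

    def __init__(self):
        self.slots = defaultdict(lambda: [0.0, 0])  # slot -> [sum, count]
        self.children = {}                          # next-dimension value -> Node

    def count(self):
        return sum(c for _, c in self.slots.values())

    def total(self):
        return sum(s for s, _ in self.slots.values())


class StreamTree:
    """Simplified, hypothetical stand-in for the paper's Stream-Tree."""

    def __init__(self, path, window_slots, min_support):
        self.path = path                  # dimension order along the interesting path
        self.window_slots = window_slots  # number of most recent slots retained
        self.min_support = min_support    # cells below this count are not refined
        self.root = Node()

    def insert(self, tup, measure, slot):
        node = self.root
        node.slots[slot][0] += measure
        node.slots[slot][1] += 1
        for dim in self.path:
            # Sparse cell: stop here and keep only the higher-level aggregate.
            if node.count() < self.min_support:
                break
            node = node.children.setdefault(tup[dim], Node())
            node.slots[slot][0] += measure
            node.slots[slot][1] += 1

    def expire(self, current_slot):
        """Drop slots that have fallen out of the window and prune empty cells."""
        self._expire(self.root, current_slot - self.window_slots + 1)

    def _expire(self, node, oldest):
        for s in [s for s in node.slots if s < oldest]:
            del node.slots[s]
        for key in list(node.children):
            child = node.children[key]
            self._expire(child, oldest)
            if not child.slots and not child.children:
                del node.children[key]

    def query(self, values):
        """Return (sum, count) for a value prefix, or None if not materialized."""
        node = self.root
        for v in values:
            node = node.children.get(v)
            if node is None:
                return None
        return node.total(), node.count()
```

In this sketch a child cell only accumulates tuples that arrive after its parent has reached `min_support`, so finer cells can undercount relative to the full window. Whether and how the paper's method handles this case is not stated in the abstract.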

