
Closed Cubing on PC Clusters (在PC集群上的封闭立方体计算)

Cited by: 1
Abstract: The closed cube is an effective data cube compression technique in online analytical processing (OLAP), but parallel algorithms for computing it have received little attention in the literature. This paper proposes a simple, practical solution: it uses the MapReduce framework to precompute and query closed cubes in a distributed fashion on low-cost, shared-nothing PC clusters. Experiments show that the approach quickly processes data sets of ten million rows or more, achieves good linear speedup, and further reduces the storage space of the data cube.
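Source: Computer Science (《计算机科学》), CSCD, PKU Core, 2009, No. 6, pp. 153-155, 161 (4 pages)
Funding: Supported by the Guangdong Province International Science and Technology Cooperation Program (2007A050100026), the Guangdong Province Science and Technology Program (2006B11301001), and the Guangdong Province Industrial Science and Technology Research Program (2006B80407001)
Keywords: OLAP, parallel computation, closed cube, MapReduce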
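The abstract names the idea (closed cubing expressed as MapReduce jobs) but this record does not give the algorithm. The following is a minimal single-process sketch of that general idea, not the paper's method. It assumes the usual definition of a closed cell: a cell is closed if none of its `*` dimensions takes a single value across all the tuples it covers. The names `map_phase`, `reduce_phase`, and `closed_cube` are hypothetical, and a dictionary stands in for the shuffle step of a real cluster.

```python
from collections import defaultdict
from itertools import product

STAR = "*"  # marks an aggregated ("all") dimension

def map_phase(row):
    """Emit every generalization (cell) of one base tuple, paired with the tuple."""
    dims, measure = row
    for mask in product([False, True], repeat=len(dims)):
        cell = tuple(v if keep else STAR for v, keep in zip(dims, mask))
        yield cell, (dims, measure)

def reduce_phase(cell, values):
    """Aggregate one cell; return (count, sum) if the cell is closed, else None."""
    for i, v in enumerate(cell):
        # If every covered tuple agrees on a starred dimension, a more
        # specific cell has the same aggregate, so this cell is not closed.
        if v == STAR and len({dims[i] for dims, _ in values}) == 1:
            return None
    return len(values), sum(m for _, m in values)

def closed_cube(rows):
    """Simulate map -> shuffle -> reduce in one process (assumption: no cluster)."""
    groups = defaultdict(list)  # stands in for the shuffle step
    for row in rows:
        for cell, value in map_phase(row):
            groups[cell].append(value)
    result = {}
    for cell, values in groups.items():
        agg = reduce_phase(cell, values)
        if agg is not None:
            result[cell] = agg
    return result
```

Emitting all 2^d generalizations per tuple is the simplest possible mapper and is exponential in the number of dimensions; it is used here only to keep the closedness check easy to follow.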

