
A Parallel Closed-Cubing Algorithm Based on MapReduce

Cited by: 8
Abstract: The closed cube is an efficient and important technique for data cube compression, yet parallel algorithms for computing it have not been studied so far. This paper therefore proposes a new approach that parallelizes the C-Cubing method with the MapReduce parallel model. In the Map phase, the representative tuple and closed mask of each data cell are computed for every data block; in the Reduce phase, these partial results are aggregated to obtain the closed cells. Experimental results show that the proposed approach effectively speeds up the computation of closed cubes on large datasets.
Source: Journal of South China University of Technology (Natural Science Edition), 2009, No. 1, pp. 91-95, 112 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: Guangdong Provincial Science and Technology Program (2004A10205003, 2006B11301001); Guangzhou Municipal Science and Technology Program (2006Z3-D3081)
Keywords: data warehouse; online analytical processing; parallel algorithm; closed cube; MapReduce technology
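
The Map and Reduce steps described in the abstract can be illustrated with a small in-process sketch. It is not the paper's implementation: it enumerates every cell of each tuple naively instead of using C-Cubing's own cubing strategy, it runs the two phases as plain Python functions rather than on a MapReduce cluster, and it uses COUNT as the only measure. The names `cells_of`, `merge`, `map_block`, and `reduce_closed` are hypothetical, and the string `"*"` is assumed never to occur as a dimension value. The merge rule follows the aggregation-based checking idea: a dimension stays set in the closed mask only if both partial results share a single value on it and their representative tuples agree there.

```python
from itertools import product

STAR = "*"  # assumed marker for an aggregated ("all") dimension


def cells_of(row):
    """Yield every cell (group-by key) that a tuple contributes to."""
    for keep in product((True, False), repeat=len(row)):
        yield tuple(v if k else STAR for v, k in zip(row, keep))


def merge(a, b):
    """Combine two partial summaries of the same cell.

    A summary is (count, representative tuple, closed mask); mask[d] is
    True when all tuples seen so far share one value on dimension d.
    """
    count_a, rep_a, mask_a = a
    count_b, rep_b, mask_b = b
    mask = tuple(
        ma and mb and ra == rb
        for ma, mb, ra, rb in zip(mask_a, mask_b, rep_a, rep_b)
    )
    return (count_a + count_b, min(rep_a, rep_b), mask)


def map_block(block):
    """Map phase: summarize each cell within one data block."""
    partial = {}
    for row in block:
        single = (1, row, (True,) * len(row))
        for cell in cells_of(row):
            partial[cell] = merge(partial[cell], single) if cell in partial else single
    return partial


def reduce_closed(partials):
    """Reduce phase: merge per-block summaries and keep only closed cells."""
    merged = {}
    for partial in partials:
        for cell, summary in partial.items():
            merged[cell] = merge(merged[cell], summary) if cell in merged else summary
    closed = {}
    for cell, (count, _rep, mask) in merged.items():
        # A cell is closed when no aggregated dimension has a shared value.
        if not any(v == STAR and m for v, m in zip(cell, mask)):
            closed[cell] = count
    return closed


if __name__ == "__main__":
    blocks = [
        [("a1", "b1", "c1"), ("a1", "b1", "c2")],
        [("a2", "b1", "c1")],
    ]
    for cell, count in sorted(reduce_closed(map_block(b) for b in blocks).items()):
        print(cell, count)
```

Because the summary carries the representative tuple itself, the merge needs no lookup into the base table, so the result does not depend on how the tuples are split into blocks.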
