Parallel Computation of Shell Fragment Cubes Based on Map/Reduce (cited by: 4)
Abstract: For large data sets that are high-dimensional and have hierarchical dimensions, this paper proposes a parallel shell fragment cube construction algorithm based on the Map/Reduce framework. The algorithm supports both parallel construction and parallel querying of the shell fragment cube. In the Map phase, the construction algorithm computes, for each data partition, all possible data cells or hierarchical-dimension encoding prefixes; in the Reduce phase, it aggregates these results into the final shell fragments and the measure index table. Experiments show that the parallel algorithm combines the parallelism and scalability of the Map/Reduce framework with the compression strategy and inverted-index mechanism of the shell fragment cube. It effectively avoids the explosive growth in data volume that occurs when high-dimensional data is materialized, and it supports fast construction and querying.
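Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2015, No. 22, pp. 124-129 (6 pages)
Funding: Special Scientific Research Fund for Public Welfare Industry of the Ministry of Water Resources (No. 201501022)
Keywords: On-Line Analytical Processing (OLAP); shell fragment cube; Map/Reduce; parallel computation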
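
The Map and Reduce phases described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the toy table `ROWS`, the fragment layout `FRAGMENTS`, and the function names are all hypothetical, the hierarchical-dimension prefix encoding is omitted, and the "shuffle" is simulated with an in-memory list rather than a real Map/Reduce runtime. It assumes the usual shell fragment idea of storing, for each cell inside a fragment, an inverted list of tuple IDs, plus a separate table from tuple ID to measure.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy table: (tuple ID, dimension values, measure).
ROWS = [
    (1, {"A": "a1", "B": "b1", "C": "c1"}, 10.0),
    (2, {"A": "a1", "B": "b2", "C": "c1"}, 20.0),
    (3, {"A": "a2", "B": "b1", "C": "c2"}, 5.0),
    (4, {"A": "a1", "B": "b1", "C": "c2"}, 7.0),
]
# Assumed partition of the dimensions into shell fragments.
FRAGMENTS = [("A", "B"), ("C",)]


def map_partition(rows):
    """Map step: emit (cell, tid) for every cell of every cuboid in each fragment."""
    for tid, dims, _measure in rows:
        for fragment in FRAGMENTS:
            for k in range(1, len(fragment) + 1):
                for subset in combinations(fragment, k):
                    cell = tuple((d, dims[d]) for d in subset)
                    yield cell, tid


def reduce_cells(pairs):
    """Reduce step: merge the tuple IDs emitted for the same cell into one inverted list."""
    index = defaultdict(set)
    for cell, tid in pairs:
        index[cell].add(tid)
    return dict(index)


def build(partitions):
    """Run map over each partition, then reduce; also build the tid -> measure table."""
    pairs = [pair for part in partitions for pair in map_partition(part)]
    inverted = reduce_cells(pairs)
    measures = {tid: m for part in partitions for tid, _dims, m in part}
    return inverted, measures


def query_sum(inverted, measures, **conditions):
    """Answer a point query by intersecting one-dimensional inverted lists."""
    tid_sets = [inverted.get(((d, v),), set()) for d, v in conditions.items()]
    tids = set.intersection(*tid_sets) if tid_sets else set(measures)
    return sum(measures[t] for t in tids)


if __name__ == "__main__":
    inverted, measures = build([ROWS[:2], ROWS[2:]])  # two data partitions
    print(query_sum(inverted, measures, A="a1", C="c1"))  # 30.0
```

Only cells inside a fragment are materialized, so the stored index grows with the number and size of fragments rather than with the full set of cuboids; queries that span fragments, such as `A="a1", C="c1"` above, are answered at query time by intersecting tuple ID lists.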

