Abstract
The closed cube is an effective data cube compression technique in online analytical processing (OLAP), but parallel algorithms for it have received little attention in the literature. This paper presents a simple and practical solution: based on the MapReduce computing framework, it distributes both the precomputation and the querying of the closed cube across a low-cost, shared-nothing PC cluster. Experiments show that the approach can quickly process data sets on the order of ten million rows, achieves good linear speedup, and compresses the storage space of the data cube to a greater degree.
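As an illustration of the idea summarized above, the following is a minimal single-process sketch of a MapReduce-style closed cube computation with COUNT as the measure. It is not the paper's algorithm: the names `map_tuple`, `merge`, and `closed_cube` are hypothetical, the map step naively emits all 2^d generalizations of each tuple, and the shuffle is simulated with a dictionary. It assumes no base value is literally `"*"`. A cell is kept only if it equals its closure, that is, none of its `*` dimensions has a single value shared by every tuple the cell covers.

```python
from itertools import product

STAR = "*"  # assumption: "*" never occurs as a real dimension value


def map_tuple(row):
    """Map step (hypothetical): emit every generalization of a base tuple.

    Each emitted value is (row, 1): the tuple itself and a count of one.
    """
    for mask in product((False, True), repeat=len(row)):
        cell = tuple(STAR if hide else v for v, hide in zip(row, mask))
        yield cell, (row, 1)


def merge(a, b):
    """Reduce/combine step: associative and commutative.

    Keeps a dimension value only where both sides agree, and sums the counts.
    """
    (rep_a, n_a), (rep_b, n_b) = a, b
    rep = tuple(x if x == y else STAR for x, y in zip(rep_a, rep_b))
    return rep, n_a + n_b


def closed_cube(rows):
    """Simulate map, shuffle, and reduce in one process; return closed cells."""
    groups = {}
    for row in rows:
        for cell, value in map_tuple(row):
            groups[cell] = merge(groups[cell], value) if cell in groups else value
    # A cell is closed when it equals the closure of the tuples it covers.
    return {cell: n for cell, (closure, n) in groups.items() if cell == closure}


if __name__ == "__main__":
    rows = [("a1", "b1"), ("a1", "b2")]
    for cell, count in sorted(closed_cube(rows).items()):
        print(cell, count)
```

Because `merge` is associative and commutative, it could in principle also serve as a combiner on each node before the shuffle; whether the paper does this is not stated in the abstract.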
Source
《计算机科学》
CSCD
PKU Core (Peking University Core Journals)
2009, Issue 6, pp. 153-155, 161 (4 pages)
Computer Science
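Funding
Guangdong Provincial International Science and Technology Cooperation Program (2007A050100026)
Guangdong Provincial Science and Technology Program (2006B11301001)
Guangdong Provincial Industrial Science and Technology Key Research Program (2006B80407001)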