Parallel Data Warehouses Architecture Based on PC Cluster
Cited by: 4
Abstract: As data warehouses grow in size, ensuring the performance of ad hoc query analysis over massive data becomes a major challenge. To address this issue, the paper proposes HDW, a parallel data warehouse architecture built on a PC cluster. HDW uses Google's GFS and Bigtable techniques for distributed storage management and MapReduce for parallel online analytical processing (OLAP), and it provides front-end applications with a unified interface that conforms to the XMLA specification. Experiments on an 18-node cluster show that HDW scales well and can quickly process data sets of at least 10 million tuples.
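Source: Computer Engineering (《计算机工程》), 2009, No. 20, pp. 73-75 (3 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: Guangdong Province International Science and Technology Cooperation Program (2007A050100026); Guangdong Province Science and Technology Program (2006B11301001); Guangdong Province Industrial Science and Technology Key Research Program (2006B80407001).
Keywords: data warehouse; online analytical processing (OLAP); cluster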
References (6)

  • [1] DeWitt D J, Madden S, Stonebraker M. How to Build a High Performance Data Warehouse[EB/OL]. (2008-01-01). http://db.lcs.mit.edu/madden/high_perf.pdf.
  • [2] Dehne F, Rau-Chaplin A, Eavis T. The PANDA Project[EB/OL]. [2008-11-13]. http://projects.cs.dal.ca/panda/.
  • [3] Li S E, Wang S. Research on Closed Data Cube Technology[J]. Journal of Software (软件学报), 2004, 15(8): 1165-1171. Cited by: 25
  • [4] Ghemawat S, Gobioff H, Leung S T. The Google File System[C]//Proc. of the 19th Symposium on Operating Systems Principles. [S.l.]: ACM Press, 2003.
  • [5] Dean J, Ghemawat S. MapReduce: Simplified Data Processing on Large Clusters[C]//Proc. of the 6th Symposium on Operating Systems Design and Implementation. San Francisco, CA, USA: [s. n.], 2004.
  • [6] Chang F, Dean J, Ghemawat S, et al. Bigtable: A Distributed Storage System for Structured Data[C]//Proc. of the 7th Symposium on Operating Systems Design and Implementation. Seattle, WA, USA: [s. n.], 2006.
