
Probing Parallel Technique-Based Statistical Analysis for Enormous Data (cited by: 8)
Abstract: Enterprises today urgently need to run statistical analysis over massive volumes of data. How to obtain statistical results efficiently from such data is an important part of the analysis process. The author surveys and compares current methods for processing large data volumes and concludes that a database with a parallel computing architecture is the best way to solve this problem. A performance test was also conducted, and comparative results were obtained. The study should be a useful reference for others working in related research.
Affiliation: Shanghai Jiao Tong University
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2011, No. 3, pp. 162-165 (4 pages)
Keywords: MapReduce; parallel database; SQL; Greenplum

References (6)

  • 1. Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters[C]//Proc. OSDI, 2004.
  • 2. Chang F, Dean J, Ghemawat S, et al. Bigtable: A distributed storage system for structured data.
  • 3. Valduriez P. Parallel database systems: Open problems and new issues[J]. Distributed and Parallel Databases, 1993, 2(1): 137-165.
  • 4. Mach W, Schikuta E. Parallel database sort and join operations revisited on grids. 2007.
  • 5. Jacky. Greenplum技术浅析 (A brief technical analysis of Greenplum)[EB/OL]. 2009-07. http://www.hellodba.net/2009/07/greenplum.html.
  • 6. Greenplum. GP 3.2 Training Guide, 2009.

Co-cited documents: 54

Citing documents: 8

Second-level citing documents: 59
