Development and Application of the GPU-based Spark Big Data Technology in Laboratory

Cited by: 7
Abstract: In the era of big data, computer systems urgently need to combine big data processing with high-performance computing. Focusing on Spark big data processing and GPU-based high-performance computing, this paper analyzes the GPU-based Spark technology proposed by industry. The technology gives a computer powerful computing capability mainly by building heterogeneous parallelism across the CPU and GPU. The paper then discusses the implementation of Spark-GPU in a laboratory environment and describes the technical workflow of the algorithm implementation. On this basis, simulation experiments are used to evaluate the performance of Spark and Spark-GPU. The results show that Spark-GPU can achieve speedups of a hundredfold or more, which can play an important role in advancing fields such as image processing and information retrieval.
Source: Research and Exploration in Laboratory (《实验室研究与探索》), CAS, PKU Core Journal, 2017, No. 1, pp. 112-116, 131 (6 pages)
Funding: National Natural Science Foundation of China (NSFC61203273); Natural Science Foundation of Jiangsu Province (BK20141004)
Keywords: big data processing; heterogeneous computing; graphics processing unit