Abstract
In the era of big data, computer systems urgently need to combine big data processing with high-performance computing. Focusing on Spark big data processing and GPU-based high-performance computing, this paper analyzes the GPU-based Spark technology proposed by industry. The approach builds heterogeneous parallelism between the CPU and the GPU to give the computer strong computing capability. The paper then discusses the implementation of Spark-GPU in a laboratory environment and describes in detail the technical workflow of the algorithm implementation. On this basis, simulation experiments evaluate the performance of Spark and Spark-GPU. The results show that Spark-GPU can achieve a speedup on the order of a hundredfold, which can help advance fields such as image processing and information retrieval.
Source
Research and Exploration in Laboratory (《实验室研究与探索》)
CAS
PKU Core Journals (北大核心)
2017, No. 1, pp. 112-116, 131 (6 pages)
Funding
National Natural Science Foundation of China (NSFC61203273)
Natural Science Foundation of Jiangsu Province (BK20141004)
Keywords
big data processing
heterogeneous computing
graphics processing unit