Providing Source Code Level Portability Between CPU and GPU with MapCG


Abstract: Graphics processing units (GPUs) have taken an important role in the general-purpose computing market in recent years. At present, the common approach to programming GPUs is to write GPU-specific code with low-level GPU APIs such as CUDA. Although this approach can achieve good performance, it creates serious portability issues, because programmers are required to write a specific version of the code for each potential target architecture. This results in high development and maintenance costs. We believe it is desirable to have a programming model that provides source code portability between CPUs and GPUs, as well as among different GPUs. This would allow programmers to write one version of the code, which can be compiled and executed efficiently on either CPUs or GPUs without modification. In this paper, we propose MapCG, a MapReduce framework that provides source code level portability between CPUs and GPUs. In contrast to other approaches such as OpenCL, our framework, based on MapReduce, provides a high-level programming model and makes programming much easier. We describe the design of MapCG, including the MapReduce-style high-level programming framework and the runtime system on the CPU and GPU. A prototype of the MapCG runtime, supporting multi-core CPUs and NVIDIA GPUs, was implemented. Our experimental results show that this implementation can execute the same source code efficiently on multi-core CPU platforms and GPUs, achieving an average speedup of 1.6-2.5x over previous implementations of MapReduce on eight commonly used applications.
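The idea of writing the map and reduce functions once and letting a runtime choose where to run them can be illustrated with a small sketch. The code below is not MapCG's API, which this record does not show; the names `run_mapreduce`, `word_count_map`, `word_count_reduce`, and the `backend` parameter are hypothetical, and the two backends (serial and thread pool) only stand in for the separate CPU and GPU runtimes the abstract describes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def word_count_map(chunk, emit):
    """User-written map: emit (word, 1) for every word in one input chunk."""
    for word in chunk.split():
        emit(word.lower(), 1)


def word_count_reduce(key, values, emit):
    """User-written reduce: sum all counts collected for one key."""
    emit(key, sum(values))


def run_mapreduce(chunks, map_fn, reduce_fn, backend="serial", workers=4):
    """Hypothetical runtime: the same user functions run on either backend.

    The backends here are stand-ins (assumption); a real framework would
    dispatch to CPU threads or GPU kernels instead.
    """
    def map_one(chunk):
        pairs = []
        map_fn(chunk, lambda k, v: pairs.append((k, v)))
        return pairs

    if backend == "serial":
        mapped = [map_one(c) for c in chunks]
    elif backend == "threads":
        with ThreadPoolExecutor(max_workers=workers) as pool:
            mapped = list(pool.map(map_one, chunks))
    else:
        raise ValueError(f"unknown backend: {backend}")

    # Group intermediate values by key before the reduce phase.
    groups = defaultdict(list)
    for pairs in mapped:
        for k, v in pairs:
            groups[k].append(v)

    results = {}

    def emit_result(key, value):
        results[key] = value

    for k, vs in groups.items():
        reduce_fn(k, vs, emit_result)
    return results


if __name__ == "__main__":
    text = ["the cat sat", "the dog sat", "The end"]
    print(run_mapreduce(text, word_count_map, word_count_reduce, backend="serial"))
    print(run_mapreduce(text, word_count_map, word_count_reduce, backend="threads"))
```

The user-facing functions never mention the execution target; only the `backend` argument changes, which is the kind of source-level portability the abstract argues for.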

Authors: Chun-Tao Hong (洪春涛), De-Hao Chen (陈德颢), Yu-Bei Chen (陈羽北), Wen-Guang Chen (陈文光), Wei-Min Zheng (郑纬民), Hai-Bo Lin (林海波)

Affiliations: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; IBM China Research Lab, Beijing 100094, China

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2012, Issue 1, pp. 42-56 (15 pages). Indexed in SCIE, EI, CSCD.

Funding: Supported by the National Natural Science Foundation of China under Grant No. 60973143, the National High Technology Research and Development 863 Program of China under Grant No. 2008AA01A201, and the National Basic Research 973 Program of China under Grant No. 2007CB310900.

Keywords: portability, parallel, GPU programming

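Classification: TP332 [Automation and Computer Technology: Computer System Architecture]; TP391.41 [Automation and Computer Technology: Computer Application Technology]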


