摘要
针对GPU上应用开发移植困难的问题,提出了一种串行计算源程序到并行计算源程序的映射方法。该方法从串行源程序中获得可并行化循环的层次信息,建立循环体结构与GPU线程的对应关系,生成GPU端核心函数代码;根据变量引用读写属性生成CPU端控制代码。基于该方法实现了一个编译原型系统,完成了C语言源程序到CUDA源程序的自动生成。对原型系统在功能和性能方面的测试结果表明,该系统生成的CUDA源程序与C语言源程序在功能上一致,其性能有显著提高,在一定程度上解决了计算密集型应用向CPU-GPU异构多核系统移植困难的问题。
Aiming at the developing and porting difficulties of GPU-based applications, a mapping approach is proposed, which converts serial computing source code into equivalent parallel computing source code. This approach acquires hier-archies of parallelizable loops from serial sources, establishes the correspondence between loop structures and GPU threads, and generates the core function code for GPU. Meanwhile, CPU control code is generated according to read/write attributes of variable references. A compiler prototype is implemented based on this approach, which translates C code into CUDA code automatically. Functionality and performance evaluations of the prototype show that the CUDA code generated is functionally equivalent to the original C code, with significant improvement in performance, thus overcomes the diffi-culty in porting compute-intensive applications to CPU-GPU heterogeneous systems.
出处
《计算机工程与应用》
CSCD
北大核心
2015年第21期41-47,共7页
Computer Engineering and Applications
基金
国家自然科学基金(No.61173039)
青年基金项目(No.61202041)
国家高技术研究发展计划(863)(No.2012AA010904
No.2012AA01A306)
深圳市科技计划(No.JCYJ20120615101127404)