
An Acceleration Experience Based on OpenACC Directives
Abstract As more companies and enterprises adopt GPUs as accelerators for computation, the demand for parallel programs keeps growing. Such programs are currently developed mostly with low-level APIs such as CUDA or OpenCL, but development with these low-level APIs is cumbersome and inefficient. The OpenACC directives were proposed to address this problem: directives placed in C or C++ source move the compute-intensive code to the GPU for execution. In this paper we implement a Gaussian blur algorithm on the CPU, with OpenACC, and with CUDA, and compare the efficiency of the three versions. Although OpenACC performs somewhat below CUDA, CUDA has a steep learning curve and low development efficiency, whereas OpenACC achieves a significant speedup with only a dozen or so lines of code and therefore offers a good cost-benefit trade-off. As compiler and hardware technology advance, OpenACC has broad room for further development.
Author HU Yu-gui (胡玉贵) (College of Computer Science, Guangdong Institute of Science and Technology, Zhuhai 519090, China)
Source Computer Knowledge and Technology (《电脑知识与技术》), 2012, No. 12, pp. 8248-8250 (3 pages)
Funding Natural Science Foundation of Guangdong Province (S2011010002537)
Keywords OpenACC; CUDA; GPGPU; convolution
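The approach summarized in the abstract, annotating an ordinary C convolution loop with OpenACC directives so that it runs on the GPU, can be sketched roughly as follows. This is a minimal illustration and not the author's code: the function name `gaussian_blur`, the 5x5 binomial weights (a common approximation of a Gaussian), the clamped borders, and the single-channel float image layout are all assumptions made for the example.

```c
#include <stddef.h>

#define RADIUS 2                 /* assumed blur radius */
#define KSIZE  (2 * RADIUS + 1)  /* 5 taps per dimension */

/*
 * Blur a width x height single-channel image stored row by row.
 * Hypothetical helper for illustration only.
 * Borders are handled by clamping coordinates to the image edge.
 */
void gaussian_blur(const float *restrict in, float *restrict out,
                   int width, int height)
{
    /* 1-D binomial weights (1 4 6 4 1) / 16; the 2-D weight is their product. */
    const float w[KSIZE] = { 1.0f / 16, 4.0f / 16, 6.0f / 16,
                             4.0f / 16, 1.0f / 16 };

    /* Copy the input and the weights to the device, copy the result back. */
    #pragma acc data copyin(in[0:width*height], w[0:KSIZE]) \
                     copyout(out[0:width*height])
    {
        /* Each output pixel is independent, so both outer loops are parallel. */
        #pragma acc parallel loop collapse(2)
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                float sum = 0.0f;
                for (int dy = -RADIUS; dy <= RADIUS; ++dy) {
                    int yy = y + dy;
                    yy = yy < 0 ? 0 : (yy >= height ? height - 1 : yy);
                    for (int dx = -RADIUS; dx <= RADIUS; ++dx) {
                        int xx = x + dx;
                        xx = xx < 0 ? 0 : (xx >= width ? width - 1 : xx);
                        sum += w[dy + RADIUS] * w[dx + RADIUS]
                             * in[(size_t)yy * width + xx];
                    }
                }
                out[(size_t)y * width + x] = sum;
            }
        }
    }
}
```

A compiler without OpenACC support ignores the `#pragma acc` lines and builds the same source as plain sequential C, which is the productivity argument the abstract makes: the CPU version and the accelerated version differ only by a few directive lines.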

