An Acceleration Experience Based on OpenACC Directives

Abstract: As more companies and enterprises adopt GPUs as compute accelerators, demand for parallel programs keeps growing. Such programs are currently written mostly with low-level APIs such as CUDA or OpenCL, and development with these APIs is cumbersome and inefficient. OpenACC directives were proposed to address this problem: directives added to C or C++ source move the compute-intensive code to the GPU for execution. In this paper we implement the Gaussian blur algorithm on the CPU, with OpenACC, and with CUDA, and compare the performance of the three versions. OpenACC performs somewhat below CUDA, but CUDA has a steep learning curve and low development efficiency, whereas OpenACC achieves a significant speedup with only about a dozen lines of code and therefore offers a good return for the effort. As compiler and hardware technology advance, directive-guided compilation has broad room for further development.
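The directive-based approach described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's code: the function name `gaussian_blur`, the 5x5 binomial approximation of the Gaussian kernel, the row-major grayscale layout, and the clamped borders are all assumptions made for the example. The only OpenACC-specific part is the single `#pragma acc` line, which asks the compiler to offload the two outer loops.

```c
#include <assert.h>

#define KR 2                /* kernel radius (assumption for this sketch) */
#define KW (2 * KR + 1)     /* kernel width: 5 */

/* Blur a w x h grayscale image stored row-major in `in`, writing to `out`.
 * Pixels outside the image are replaced by the nearest edge pixel. */
void gaussian_blur(const float *in, float *out, int w, int h)
{
    /* 1-D binomial weights; their outer product approximates a Gaussian. */
    const float w1[KW] = {1.0f, 4.0f, 6.0f, 4.0f, 1.0f};
    float k[KW * KW];

    assert(w > 0 && h > 0);

    /* Build the normalized 5x5 kernel (the weights sum to 1). */
    for (int j = 0; j < KW; ++j)
        for (int i = 0; i < KW; ++i)
            k[j * KW + i] = w1[j] * w1[i] / 256.0f;

    /* A compiler without OpenACC support ignores this pragma,
     * and the loop nest simply runs on the CPU. */
    #pragma acc parallel loop collapse(2) copyin(in[0:w*h], k[0:KW*KW]) copyout(out[0:w*h])
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int j = -KR; j <= KR; ++j) {
                for (int i = -KR; i <= KR; ++i) {
                    int yy = y + j;
                    int xx = x + i;
                    /* Clamp coordinates to the image borders. */
                    if (yy < 0) yy = 0;
                    if (yy >= h) yy = h - 1;
                    if (xx < 0) xx = 0;
                    if (xx >= w) xx = w - 1;
                    sum += in[yy * w + xx] * k[(j + KR) * KW + (i + KR)];
                }
            }
            out[y * w + x] = sum;
        }
    }
}
```

The same source serves as the CPU baseline when compiled without OpenACC support, which is one reason the directive approach costs so few extra lines compared with a separate CUDA kernel.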

Author: HU Yu-gui (胡玉贵), College of Computer Science, Guangdong Institute of Science and Technology, Zhuhai 519090, China

Source: Computer Knowledge and Technology (《电脑知识与技术》), 2012, No. 12, pp. 8248-8250 (3 pages)

Funding: Natural Science Foundation of Guangdong Province (S2011010002537)

Keywords: OpenACC; CUDA; GPGPU; convolution

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

References (5)

[1] Bordawekar R, Bondhugula U, Rao R. Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU. Technical report, IBM Research Division, 2010.
[2] Brecher C, Gorgels C, Hardjosuwito A. Simulation based Tool Wear Analysis in Bevel Gear Cutting. International Conference on Gears, volume 2108.2 of VDI-Berichte, pages 1381-1384, Düsseldorf, 2010. VDI Verlag.
[3] Bücker M, Beucker R, Rupp A. Parallel Minimum p-Norm Solution of the Neuromagnetic Inverse Problem for Realistic Signals Using Exact Hessian-Vector Products. SIAM Journal on Scientific Computing, 2008, 30(6): 2905-2921.
[4] Dolbeau R, Bihan S, Bodin F. HMPP: A Hybrid Multi-core Parallel Programming Environment. In First Workshop on General Purpose Processing on Graphics Processing Units, 2007.
[5] CAPS Enterprise, Cray Inc., NVIDIA, and the Portland Group. The OpenACC Application Programming Interface, v1.0, 2011.
