Application of multiple GPUs on high performance numerical calculation

Cited by: 2
Abstract: To address the large number of numerical calculation problems in the nuclear energy field, and building on the multi-GPU (graphics processing unit) compute nodes found in practical multi-core hardware platforms, a task-level parallel programming framework based on CUDA (Compute Unified Device Architecture) is proposed. The framework provides a basic GPU programming model on parallel hardware platforms for the numerical calculations in nuclear power engineering design. It separates CUDA's low-level scheduling of multiple GPUs from upper-level use, isolating the low-level programming techniques so that designers and developers find the low-level CUDA interface less difficult to use. The time-consuming computation modules of the main program are rewritten in CUDA and ported to the GPUs, which improves the speedup that multiple GPUs deliver on computing tasks. Experimental results show that the framework effectively improves multi-GPU acceleration of computing tasks.
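The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of task-level dispatch with the scheduling layer hidden from the caller. It is not the authors' framework. Everything here is an assumption made for illustration: the names `run_tasks` and `make_sum_of_squares` are hypothetical, each worker thread merely stands in for one GPU, and the "time-consuming module" is a plain CPU loop rather than a CUDA kernel.

```python
import queue
import threading


def run_tasks(tasks, num_devices):
    """Run every task on one of `num_devices` workers; return results in task order.

    Each worker thread stands in for one GPU (an assumption for illustration).
    In a real CUDA host program the worker would first bind to its device,
    for example with cudaSetDevice, and then launch kernels. Here a task is
    just a Python callable that receives the device id.
    """
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)
    used_device = [None] * len(tasks)

    def worker(device_id):
        # Pull tasks until the shared queue is empty, so faster workers
        # naturally take on more tasks.
        while True:
            try:
                index, task = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = task(device_id)
            used_device[index] = device_id

    threads = [threading.Thread(target=worker, args=(d,)) for d in range(num_devices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, used_device


def make_sum_of_squares(n):
    """Build a hypothetical stand-in for a time-consuming numerical module."""
    def task(device_id):
        return sum(i * i for i in range(n))
    return task


# Upper-level code only lists the tasks; it never touches the scheduling logic.
tasks = [make_sum_of_squares(n) for n in (10, 100, 1000)]
results, used_device = run_tasks(tasks, num_devices=2)
print(results)
```

The caller sees a single function call, while the choice of which worker runs which task stays inside `run_tasks`. That mirrors, in a very simplified way, the separation between low-level scheduling and upper-level use that the abstract describes.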
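Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 7, pp. 2602-2606 (5 pages)
Funding: National Defense Industry Technical Foundation research fund project of the 12th Five-Year Plan (科工技[2010]1425号-41)
Keywords: parallel computing; parallel programming; GPU (graphics processing unit); CUDA (Compute Unified Device Architecture); numerical calculation; scheduling policy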