
GPGPU Accelerated Massive Parallel Design of Long Wave Radiation Process in GRAPES-Global Model
Cited by: 5
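Abstract: With the rapid development of general-purpose graphics processing unit (GPGPU) computing, raising performance by massively increasing the concurrency of the processing system has become the latest trend in high-performance computing. GPGPUs are now used in many areas of scientific computing. Long-wave radiation is a very important physical process in the GRAPES model, and its heavy computational load has a significant effect on the running efficiency of the whole model. Using NVIDIA's Compute Unified Device Architecture (CUDA) platform, this paper takes the long-wave radiative transfer scheme of the GRAPES global model as an example and designs and optimizes a massively parallel implementation of it. With results kept consistent, a Tesla C1060 GPGPU achieves an 11x speedup over a single high-end CPU, markedly improving the execution speed and forecast timeliness of the GRAPES global model. The study shows that GPGPU technology has considerable potential for speeding up numerical prediction models.

In recent years, with the rapid advance of GPGPU (General-Purpose Graphics Processing Unit) technology, leveraging the massively parallel processing power of GPGPUs to provide supercomputing capacity has become a new trend. GPGPUs are now applied to scientific computing in many fields. GRAPES (Global/Regional Assimilation and PrEdiction System) is a new-generation multi-scale numerical model developed by the Chinese Academy of Meteorological Sciences, and it plays an important role in weather forecasting and research. The long-wave radiation process is one of the most important physical processes in the GRAPES_Global model and takes up a large share of processing time, affecting the whole model's computing efficiency. Because this process can be partitioned into tiles in the horizontal plane, it lends itself naturally to a parallel scheme.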
A GPU has hundreds of stream processors on one chip, which lets it handle thousands of hardware threads simultaneously and gives it much higher theoretical throughput: over 1 TFlops per chip. GPUs also come with an integrated set of supporting tools, from compilers to libraries, which facilitates development. Considering the characteristics of the long-wave radiation computation, and keeping the high-level MPI communication unchanged, a low-level fine-grained parallel architecture is designed to harness the computing power of the new hardware. This massively parallel implementation is based on NVIDIA GPGPUs and CUDA technology. Rather than looping over a large batch of atmospheric columns as conventional CPU-based systems do, the new GPU-based implementation uses each small core to process a single column. This scheme has three major advantages: much higher thread concurrency, use of the larger GPU memory bandwidth, and higher computational intensity with better efficiency. Experiments with a real dataset are performed and the correctness of the new design is validated. They show that the Tesla C1060 achieves an 11x speedup over a high-end x86 CPU, greatly improving execution speed and forecast efficiency. Timings of subroutines and of data transfer are also recorded and compared. Different partition configurations are tested to find the best combination. In addition, execution is overlapped with data transfer to hide latency. The experiments show that GPGPUs have good potential to improve numerical weather forecasting models. As more routines are ported to GPU systems, a much better speedup could be achieved over the whole model.
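The one-thread-per-column mapping described in the abstract can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions: it is plain Python rather than CUDA, the names `column_kernel`, `launch`, and `serial_reference` are hypothetical, and the per-column running sum is a placeholder, not the GRAPES long-wave radiation physics. It shows how a block index and a thread index combine into one global column index, and why a bounds guard is needed when the column count is not a multiple of the block size.

```python
# Hypothetical sketch: emulates a CUDA-style 1-D launch in plain Python.
# Nothing here comes from the GRAPES source; all names and the toy
# per-column formula are assumptions made for illustration only.

def column_kernel(block_idx, thread_idx, block_dim, columns, out):
    """Body that one GPU thread would run: handle exactly one column."""
    col = block_idx * block_dim + thread_idx  # global column index
    if col >= len(columns):                   # guard the padded tail
        return
    # Placeholder for the per-column radiative transfer calculation:
    # a running sum over the vertical levels (NOT the real physics).
    total = 0.0
    for level_value in columns[col]:
        total += level_value
    out[col] = total


def launch(columns, block_dim):
    """Emulate a 1-D grid with enough blocks to cover every column."""
    n_blocks = (len(columns) + block_dim - 1) // block_dim
    out = [0.0] * len(columns)
    for b in range(n_blocks):       # on a GPU these iterations run concurrently
        for t in range(block_dim):
            column_kernel(b, t, block_dim, columns, out)
    return out


def serial_reference(columns):
    """CPU-style loop over all columns, used to check consistency."""
    return [float(sum(col)) for col in columns]


# 10 columns of 4 levels each; block size 4 leaves a partial last block.
columns = [[float(c + k) for k in range(4)] for c in range(10)]
gpu_like = launch(columns, block_dim=4)
assert gpu_like == serial_reference(columns)
```

Because each column writes only to its own output slot, the result does not depend on the block size. That independence is what allows different partition configurations to be tried without changing the answer.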
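Source: Journal of Applied Meteorological Science (《应用气象学报》), CSCD, Peking University Core Journal, 2012, Issue 3, pp. 348-354 (7 pages)
Funding: National High Technology Research and Development Program of China (2009AA01A138)
Keywords: GPGPU; numerical weather forecasting model; long-wave radiation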
