
Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units (cited 6 times)

Abstract: Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulation on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested for two-dimensional Couette flow and the results are in good agreement with the analytical solution. For both the one- and two-dimensional decomposition of space, the algorithm performs well as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to the three-dimensional decomposition of space and other modeling methods, including explicit grid-based methods.
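The central idea described in the abstract is to hide inter-node communication behind bulk GPU work: the thin boundary layers of each subdomain are updated first on a dedicated CUDA stream, their distributions are exchanged with neighbouring subdomains through non-blocking MPI, and meanwhile a second stream updates the much larger interior. The following is a minimal sketch of that overlap pattern for a one-dimensional decomposition with one GPU per MPI rank; it is not the authors' code, the OpenMP layer that drives several GPUs per node is omitted, and the kernel names, buffer layout and halo sizes are hypothetical placeholders.

#include <mpi.h>
#include <cuda_runtime.h>

/* Placeholder kernels: the actual D2Q9 collision-streaming update is omitted.
   The boundary kernel is assumed to pack its outgoing distributions into
   d_edge_out; the interior kernel touches only the inner (ny - 2) rows. */
__global__ void collide_stream_boundary(double *f, double *d_edge_out, int nx, int ny) { }
__global__ void collide_stream_interior(double *f, int nx, int ny) { }

/* One LBM time step with communication hidden behind the interior update
   (hypothetical buffer layout; h_send/h_recv are assumed to be pinned). */
void lbm_step(double *d_f, double *d_edge_out, double *d_halo_in,
              double *h_send, double *h_recv,
              int nx, int ny, int halo_count,
              int rank_up, int rank_down, MPI_Comm comm,
              cudaStream_t s_edge, cudaStream_t s_bulk)
{
    MPI_Request req[4];
    int half = halo_count / 2;

    /* 1. Update the two thin boundary layers first, on their own stream. */
    collide_stream_boundary<<<(2 * nx + 255) / 256, 256, 0, s_edge>>>(d_f, d_edge_out, nx, ny);

    /* 2. Launch the much larger interior update concurrently on another stream. */
    collide_stream_interior<<<(nx * (ny - 2) + 255) / 256, 256, 0, s_bulk>>>(d_f, nx, ny);

    /* 3. Stage the outgoing boundary distributions on the host, then exchange
          them with the upper and lower neighbours via non-blocking MPI while
          the interior kernel is still running. */
    cudaMemcpyAsync(h_send, d_edge_out, halo_count * sizeof(double),
                    cudaMemcpyDeviceToHost, s_edge);
    cudaStreamSynchronize(s_edge);

    MPI_Irecv(h_recv,        half, MPI_DOUBLE, rank_up,   0, comm, &req[0]);
    MPI_Irecv(h_recv + half, half, MPI_DOUBLE, rank_down, 1, comm, &req[1]);
    MPI_Isend(h_send,        half, MPI_DOUBLE, rank_down, 0, comm, &req[2]);
    MPI_Isend(h_send + half, half, MPI_DOUBLE, rank_up,   1, comm, &req[3]);
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);

    /* 4. Push the received ghost layers back to the device and wait for the
          interior kernel; communication is hidden whenever steps 3-4 finish
          before the interior kernel launched in step 2. */
    cudaMemcpyAsync(d_halo_in, h_recv, halo_count * sizeof(double),
                    cudaMemcpyHostToDevice, s_edge);
    cudaStreamSynchronize(s_edge);
    cudaStreamSynchronize(s_bulk);
}

Under these assumptions, the condition for the "most of the communication time is hidden" behaviour reported in the abstract is simply that the interior kernel outlasts the PCIe transfers and MPI exchange, which favours large subdomains per GPU.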
Source: Chinese Science Bulletin (SCIE, EI, CAS), 2012, No. 7, pp. 707-715 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (20221603 and 20906091)
Keywords: lattice Boltzmann method, graphic processing unit, parallel algorithm, cluster, Couette flow, LBM simulation, direct numerical simulation, asynchronous execution, compute unified device architecture, non-blocking message passing interface, OpenMP