GPU-based Algorithm Optimization for Streaming Module of Lattice Boltzmann Method

Abstract: The Lattice Boltzmann Method (LBM) is a Computational Fluid Dynamics (CFD) method that operates at the mesoscopic scale. It computes on a large number of discrete lattice points, which makes it well suited to parallelization. A Graphics Processing Unit (GPU) contains a large number of arithmetic logic units and is suited to large-scale parallel computing, so a GPU-based parallel LBM algorithm can improve computational efficiency. However, in the streaming module of LBM, the computation at each lattice point requires communication with other lattice points, which creates strong data dependence. This study proposes a GPU-based optimization strategy for the LBM streaming module. First, the implementation logic of the streaming step is analyzed, and through model dimension reduction the three-dimensional model is decomposed by velocity component into several two-dimensional models, which reduces model complexity. Second, the differences in lattice-point data before and after the streaming computation are analyzed, the communication pattern of the streaming module is identified through data positioning, and the data exchange modes between lattice points are classified. Finally, the classified exchange modes are used to partition the two-dimensional models into regions and a new data communication scheme is designed, which eliminates the effect of data dependence and makes the streaming module fully parallel. In tests, the parallel algorithm achieves a speedup of 1.92 on a grid of 1.3×10^8 points, indicating good parallel performance. Compared with an algorithm whose streaming module is not parallelized, the proposed strategy improves parallel computing efficiency by 30%.
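The data dependence in the streaming step, and one common way to remove it, can be illustrated with a minimal sketch. This is not the region-partitioning scheme of the paper, whose details the abstract does not give. It is the standard two-buffer "pull" formulation, assuming a two-dimensional D2Q9 lattice with periodic boundaries; the names `VELOCITIES` and `stream_pull` are hypothetical. Each velocity direction is handled as its own two-dimensional array, and every destination cell reads from a separate source buffer, so no cell update depends on another.

```python
# D2Q9 lattice velocities (cx, cy); this ordering is an assumption of the sketch.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def stream_pull(f_src, nx, ny):
    """Stream distributions one lattice step using a separate output buffer.

    f_src[i][y][x] holds the distribution for direction i at node (x, y).
    Periodic boundaries are assumed. f_src is never modified.
    """
    f_dst = [[[0.0] * nx for _ in range(ny)] for _ in VELOCITIES]
    for i, (cx, cy) in enumerate(VELOCITIES):
        for y in range(ny):
            for x in range(nx):
                # Pull the value from the upstream neighbour. Each (i, x, y)
                # iteration only reads f_src and writes one distinct f_dst
                # cell, so on a GPU each could be given to its own thread.
                f_dst[i][y][x] = f_src[i][(y - cy) % ny][(x - cx) % nx]
    return f_dst


if __name__ == "__main__":
    nx, ny = 4, 3
    f = [[[0.0] * nx for _ in range(ny)] for _ in VELOCITIES]
    f[1][1][3] = 1.0  # direction (1, 0) at the right edge
    out = stream_pull(f, nx, ny)
    print(out[1][1])  # the value has wrapped around to x = 0
```

The cost of this formulation is a second copy of the distribution arrays. In-place streaming schemes avoid that copy but must order or partition the exchanges carefully, which is the kind of problem the abstract describes.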

Authors: HUANG Bin; LIU Anjun; PAN Jingshan; TIAN Min; ZHANG Yu; ZHU Guanghui (Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 251013, Shandong, China; High Performance Computing Laboratory, Jinan Institute of Supercomputer Technology, Jinan 251013, Shandong, China; School of Energy Science and Engineering, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China)

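Institutions: Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); High Performance Computing Laboratory, Jinan Institute of Supercomputer Technology; School of Energy Science and Engineering, Harbin Institute of Technology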

Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2024, No. 2, pp. 232-238 (7 pages)

Funding: National Natural Science Foundation of China (62002186); Key Research and Development Program of Shandong Province (2021RZB01002).

Keywords: High Performance Computing (HPC); Lattice Boltzmann Method (LBM); Graphics Processing Unit (GPU); parallel optimization; data rearrangement

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
