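Abstract: Numerical calculation of the power-frequency electric field in power transmission and transformation systems is essential to their design and construction, but traditional methods take too long to solve three-dimensional electric fields in the Matlab environment. Taking the electric field beneath a 500 kV ultra-high-voltage transmission corridor as the model, this paper implements a parallel algorithm for the charge simulation method in mixed Matlab and C code, based on the OpenMP multicore parallel environment. The approach exploits Matlab's convenience in generating matrices and its powerful data and image processing, together with the fast compiled execution of C and its flexibility for parallel computing. It achieves parallel speedup and provides a computing platform for reducing the time cost of numerically solving three-dimensional power-frequency electric fields.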
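The core steps of the charge simulation method named in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's code: it uses a 2D simplification with infinite line charges above a perfectly conducting ground (the paper solves a 3D problem), one simulated charge per conductor, and a hypothetical geometry. Python stands in for the Matlab/C pair, and a thread pool stands in for the OpenMP loop only to show that each evaluation point is independent; it is not expected to give a speedup in Python.

```python
import math
from concurrent.futures import ThreadPoolExecutor

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def potential_coeff(xc, yc, x, y):
    """Potential at (x, y) per unit line charge at (xc, yc) and its image at (xc, -yc)."""
    d = math.hypot(x - xc, y - yc)
    d_img = math.hypot(x - xc, y + yc)
    return math.log(d_img / d) / (2 * math.pi * EPS0)

def line_charges(conductors, radius, volts):
    """Find the simulated charges that put each conductor surface at its voltage."""
    # Match point for each conductor: the lowest point of its surface.
    p = [[potential_coeff(xj, yj, xi, yi - radius) for (xj, yj) in conductors]
         for (xi, yi) in conductors]
    return solve(p, volts)

def field_at(charges, conductors, x, y):
    """Electric field (Ex, Ey) at (x, y) as the sum over charges and their images."""
    ex = ey = 0.0
    for q, (xc, yc) in zip(charges, conductors):
        k = q / (2 * math.pi * EPS0)
        dx = x - xc
        d2 = dx * dx + (y - yc) ** 2
        d2_img = dx * dx + (y + yc) ** 2
        ex += k * (dx / d2 - dx / d2_img)
        ey += k * ((y - yc) / d2 - (y + yc) / d2_img)
    return ex, ey

# Hypothetical geometry: three phase conductors 15 m above ground, radius 0.02 m.
conductors = [(-10.0, 15.0), (0.0, 15.0), (10.0, 15.0)]
v_peak = 500e3 * math.sqrt(2.0 / 3.0)  # peak phase-to-ground voltage of a 500 kV line
charges = line_charges(conductors, 0.02, [v_peak, -0.5 * v_peak, -0.5 * v_peak])

# Each evaluation point is independent, which is the loop a parallel runtime would split.
points = [(float(x), 1.5) for x in range(-30, 31)]
with ThreadPoolExecutor() as pool:
    fields = list(pool.map(lambda p: field_at(charges, conductors, *p), points))
```

The linear solve is small, while the field evaluation scales with the number of evaluation points times the number of charges, which is why the evaluation loop is the natural target for parallelization.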
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB0200701) and the National Natural Science Foundation of China (Nos. 11532016 and 91530325).
Abstract: An efficient MPI/OpenMP hybrid parallel Radial Basis Function (RBF) strategy for both continuous and discontinuous large-scale mesh deformation is proposed to reduce computational cost and memory consumption. Unlike conventional parallel methods, in which all processors use the same surface displacement and perform the same operation, the present method employs a different set of surface points and a different influence radius for the movement of each volume point, together with an efficient geometric search strategy. The deformed surface points, also called Control Points (CPs), are stored on each processor. The displacement of a spatial point is interpolated using only its 20-50 nearest control points, and the local influence radius is set to 5-20 times the maximum displacement of the control points. To shorten the search for the nearest cloud of control points, an Alternating Digital Tree (ADT) algorithm for complex 3D geometry is designed based on an iterative bisection technique. In addition, an MPI/OpenMP hybrid parallel approach is developed to reduce the memory cost on each High-Performance Computing (HPC) node for large-scale applications. Three 3D cases, including the ONERA-M6 wing and a commercial transport airplane standard model with up to 2.5 billion hybrid elements, are used to test the method. Robustness and high parallel efficiency are demonstrated by a wing deflection case with a maximum bending angle of 45° and a parallel efficiency above 80% on 1024 MPI processors. Finally, the applicability to both continuous and discontinuous surface deformation is verified by interpolating the projected displacements of surface points moving in opposite directions onto the spatial points.
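The local interpolation step described in this abstract can be illustrated with a short sketch. It rests on several assumptions that are not in the source: the Wendland C2 function is used as the compactly supported basis, a brute-force sort replaces the ADT search, and the names `deform_point`, `k`, and `radius_factor` are hypothetical. It shows only the serial per-point computation, not the MPI/OpenMP decomposition.

```python
import math

def wendland_c2(r, radius):
    """Compactly supported Wendland C2 basis (assumed); zero beyond the influence radius."""
    s = r / radius
    return (1 - s) ** 4 * (4 * s + 1) if s < 1 else 0.0

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def deform_point(point, controls, displacements, k=20, radius_factor=10.0):
    """Interpolate the displacement of one volume point from its k nearest control points."""
    # Brute-force nearest-neighbour search; a stand-in for the ADT search in the abstract.
    idx = sorted(range(len(controls)), key=lambda i: math.dist(point, controls[i]))[:k]
    pts = [controls[i] for i in idx]
    disp = [displacements[i] for i in idx]
    # Local influence radius: a multiple of the largest selected control displacement.
    radius = radius_factor * max(math.hypot(*d) for d in disp)
    if radius == 0.0:
        return (0.0, 0.0, 0.0)
    a = [[wendland_c2(math.dist(p, q), radius) for q in pts] for p in pts]
    phi = [wendland_c2(math.dist(point, q), radius) for q in pts]
    out = []
    for c in range(3):  # one small RBF system per displacement component
        w = solve(a, [d[c] for d in disp])
        out.append(sum(wi * fi for wi, fi in zip(w, phi)))
    return tuple(out)

# Hypothetical control surface: a 5 x 5 grid in the plane z = 0, sheared upward along x.
controls = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
displacements = [(0.0, 0.0, 0.1 * c[0]) for c in controls]
moved = deform_point((2.0, 2.0, 0.5), controls, displacements, k=8)
```

Because each volume point solves only a k-by-k system with its own radius, the work per point is bounded and independent of the total number of surface points, which is what makes the per-point loop easy to distribute.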
Funding: Supported by the decision support project of response to climate change of China, the National Natural Science Foundation of China (Nos. 41674085, 41604009, and 41621091), the Natural Science Foundation of Qinghai Province (No. 2019-ZJ-7034), and the Open Project of State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University (No. 2020-zz-03).
Abstract: A moisture advection scheme is an essential module of a numerical weather/climate model, representing the horizontal transport of water vapor. The Piecewise Rational Method (PRM) scalar advection scheme in the Global/Regional Assimilation and Prediction System (GRAPES) solves the moisture flux advection equation based on PRM. Computing the scalar advection involves boundary exchange, and this bandwidth-intensive computation is complicated and time-consuming in GRAPES. Recently, Graphics Processing Units (GPUs) have been widely used to solve scientific and engineering computing problems, owing to advances in GPU hardware and related programming models such as CUDA/OpenCL and Open Accelerator (OpenACC). Herein, we present an accelerated PRM scalar advection scheme that uses the Message Passing Interface (MPI) and OpenACC to fully exploit GPUs on a cluster with multiple Central Processing Units (CPUs) and GPUs, together with optimizations such as minimizing data transfer, memory coalescing, exposing more parallelism, and overlapping computation with data transfers. Results show a speedup of about 3.5 times for the entire model running at medium resolution with double precision, comparing the scheme's elapsed time on a node with two GPUs (NVIDIA P100) against two 16-core CPUs (Intel Gold 6142). Further experiments with a higher-resolution model on multiple GPUs show excellent scalability.
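The boundary exchange mentioned in this abstract can be illustrated with a small sketch. It is not the GRAPES code and makes several substitutions: a first-order upwind scheme replaces PRM, the domain is 1D and periodic, and plain Python lists stand in for MPI ranks, so the "exchange" is a list lookup rather than a message. The function names are hypothetical.

```python
def upwind_step(u_with_halo, c):
    """One first-order upwind step for positive velocity; element 0 is the left halo cell."""
    return [u_with_halo[i] - c * (u_with_halo[i] - u_with_halo[i - 1])
            for i in range(1, len(u_with_halo))]

def step_global(u, c):
    """Advance the whole periodic domain at once (reference solution)."""
    return upwind_step([u[-1]] + u, c)

def step_decomposed(parts, c):
    """Advance each subdomain after it receives one halo cell from its left neighbour."""
    # Halo exchange: index r - 1 wraps to the last subdomain for r = 0 (periodic domain).
    halos = [parts[r - 1][-1] for r in range(len(parts))]
    return [upwind_step([halos[r]] + parts[r], c) for r in range(len(parts))]

# A square pulse advected with Courant number 0.5 on 8 cells split into 2 subdomains.
u0 = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
whole = u0[:]
parts = [u0[:4], u0[4:]]
for _ in range(5):
    whole = step_global(whole, 0.5)
    parts = step_decomposed(parts, 0.5)
```

The halo must be refreshed before every step, so communication cost recurs each time step; this is the transfer that the abstract's optimizations (minimizing data transfer, overlapping it with computation) aim to hide.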