Funding: supported by the decision support project of response to climate change of China, the National Natural Science Foundation of China (Nos. 41674085, 41604009, and 41621091), the Natural Science Foundation of Qinghai Province (No. 2019-ZJ-7034), and the Open Project of State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University (No. 2020-zz-03).
Abstract: A moisture advection scheme is an essential module of a numerical weather/climate model, representing the horizontal transport of water vapor. The Piecewise Rational Method (PRM) scalar advection scheme in the Global/Regional Assimilation and Prediction System (GRAPES) solves the moisture flux advection equation based on PRM. Computing the scalar advection involves boundary exchange and has high bandwidth requirements, which makes it complicated and time-consuming in GRAPES. Recently, Graphics Processing Units (GPUs) have been widely used to solve scientific and engineering computing problems owing to advancements in GPU hardware and related programming models such as CUDA/OpenCL and Open Accelerator (OpenACC). Herein, we present an accelerated PRM scalar advection scheme with Message Passing Interface (MPI) and OpenACC to fully exploit GPUs’ power over a cluster with multiple Central Processing Units (CPUs) and GPUs, together with optimizations such as minimizing data transfer, memory coalescing, exposing more parallelism, and overlapping computation with data transfers. Results show a speedup of about 3.5 times for the entire model running at medium resolution with double precision, comparing the scheme’s elapsed time on a node with two GPUs (NVIDIA P100) against two 16-core CPUs (Intel Gold 6142). Further, experiments with a higher-resolution model on multiple GPUs show excellent scalability.
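The boundary exchange mentioned in this abstract can be illustrated with a minimal sketch. It is not the GRAPES code: a first-order upwind step stands in for PRM, plain Python lists stand in for MPI ranks and GPU arrays, and every function name is hypothetical. It only shows why each subdomain needs one halo cell from its neighbour before it can update its own cells.

```python
# Illustrative sketch only. Assumptions: first-order upwind replaces PRM,
# the domain is 1-D and periodic, and list copies replace MPI messages.

def upwind_step(q, c):
    """One periodic upwind step on the whole domain, Courant number 0 <= c <= 1."""
    # q[i - 1] wraps to the last cell when i == 0 (periodic boundary).
    return [q[i] - c * (q[i] - q[i - 1]) for i in range(len(q))]

def halo_exchange(blocks):
    """Give each block the last cell of its left neighbour (periodic)."""
    return [blocks[p - 1][-1] for p in range(len(blocks))]

def upwind_step_decomposed(blocks, c):
    """Same update, but each block only sees its own cells plus one halo cell."""
    halos = halo_exchange(blocks)
    new_blocks = []
    for block, halo in zip(blocks, halos):
        padded = [halo] + block
        new_blocks.append(
            [padded[i] - c * (padded[i] - padded[i - 1]) for i in range(1, len(padded))]
        )
    return new_blocks

q = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
size = 3  # hypothetical subdomain size; the last block may be shorter
blocks = [q[i:i + size] for i in range(0, len(q), size)]

serial = upwind_step(q, 0.5)
decomposed = [v for b in upwind_step_decomposed(blocks, 0.5) for v in b]
```

In this sketch the decomposed update reproduces the single-domain result because the halo cell supplies exactly the upstream value that the stencil needs. In a real MPI plus OpenACC code the halo copy would be a message between ranks, which is where data-transfer cost arises.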
Funding: supported partly by the Ministry of Science and Technology of the People’s Republic of China (Grant Nos. 2007CB714407 and 2008ZX10004012), the Special Funds for Basic Research in CAMS of CMA (Grant No. 2007Y001), and the State Key Laboratory of Remote Sensing Sciences (Grant No. 07S00502CX).
Abstract: A wide variety of algorithms have been developed to monitor aerosol burden from satellite images. Still, few solutions currently allow for real-time and efficient retrieval of aerosol optical thickness (AOT), mainly due to the extremely large volume of computation necessary for the numeric solution of atmospheric radiative transfer equations. Building on SYNTAM (SYNergy of Terra and Aqua MODIS), an AOT retrieval algorithm, we present in this paper a novel method to retrieve AOT from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images. The method uses block partitioning and collective communication, thereby maximizing load balance and reducing overhead during inter-processor communication. Experiments were carried out to retrieve AOT at 0.44, 0.55, and 0.67 μm from MODIS/Terra and MODIS/Aqua data, using the parallel SYNTAM algorithm on the IBM System Cluster 1600 deployed at the China Meteorological Administration (CMA). Results showed that the parallel implementation can greatly reduce computation time and thus ensure high parallel efficiency. AOT derived by the parallel algorithm was validated against measurements from ground-based sun-photometers; in all cases, the relative error was within 20%, which demonstrated that the parallel algorithm is suitable for applications such as air quality monitoring and climate modeling.
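As a rough illustration of block partitioning with collective-style scatter and gather, the sketch below splits image rows into near-equal blocks, processes each block independently, and reassembles the result. It is an assumption-laden toy, not the SYNTAM implementation: the function names are hypothetical, `retrieve_block` is a placeholder for the per-pixel retrieval, and list slicing stands in for inter-processor communication.

```python
# Illustrative sketch only. Assumptions: rows are the unit of partitioning,
# and list slicing/concatenation replaces real scatter/gather collectives.

def block_partition(n_rows, n_procs):
    """Return one (start, stop) row range per process; sizes differ by at most one."""
    base, extra = divmod(n_rows, n_procs)
    ranges, start = [], 0
    for rank in range(n_procs):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def scatter(image, ranges):
    """Hand each process its block of rows."""
    return [image[a:b] for a, b in ranges]

def gather(parts):
    """Concatenate the processed blocks back into one image, in rank order."""
    return [row for part in parts for row in part]

def retrieve_block(rows):
    """Hypothetical placeholder for the retrieval; here it just doubles each value."""
    return [[2 * v for v in row] for row in rows]

image = [[r * 10 + c for c in range(4)] for r in range(10)]
ranges = block_partition(len(image), 3)
result = gather([retrieve_block(part) for part in scatter(image, ranges)])
```

Keeping block sizes within one row of each other is one simple way to balance load when every pixel costs roughly the same; if per-pixel cost varied strongly, a different partitioning would be needed.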