17 articles found
An MPI parallel DEM-IMB-LBM framework for simulating fluid-solid interaction problems (cited by: 2)
1
Authors: Ming Xia, Liuhong Deng, Fengqiang Gong, Tongming Qu, Y.T. Feng, Jin Yu. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 6, pp. 2219-2231, 13 pages
The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for use in geotechnical engineering analysis has not been fully unleashed due to its prohibitive computational costs. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed, aimed at enhancing computational efficiency. This framework utilises a static domain decomposition scheme, with the entire computation domain being decomposed into multiple subdomains according to predefined processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across sub-domain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. Subsequently, the framework is applied to simulate scenarios involving multi-particle sedimentation and submarine landslides. The numerical examples effectively demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
Keywords: Discrete element method (DEM) Lattice Boltzmann method (LBM) Immersed moving boundary (IMB) Multi-cores parallelization message passing interface (MPI) CPU Submarine landslides
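As a rough illustration of the static domain decomposition mentioned in the abstract above (not the authors' implementation), the following sketch splits a one-dimensional extent into equal-width slabs, one per rank, and lists the particles whose owning rank changes between two time steps. Such a crossing is the kind of event that a hand-off scheme, like the particle ID re-numbering the abstract describes, would have to handle. The function names `owner_rank` and `migrations` and the equal-width slab layout are assumptions made for illustration only.

```python
def owner_rank(x, x_min, x_max, n_ranks):
    """Return the rank whose equal-width slab contains coordinate x (hypothetical helper)."""
    if not x_min <= x <= x_max:
        raise ValueError("particle left the global domain")
    width = (x_max - x_min) / n_ranks
    # Clamp so that x == x_max maps to the last rank rather than n_ranks.
    return min(int((x - x_min) / width), n_ranks - 1)


def migrations(old_positions, new_positions, x_min, x_max, n_ranks):
    """List (particle_index, old_rank, new_rank) for particles that crossed a slab interface."""
    moved = []
    for i, (x_old, x_new) in enumerate(zip(old_positions, new_positions)):
        r_old = owner_rank(x_old, x_min, x_max, n_ranks)
        r_new = owner_rank(x_new, x_min, x_max, n_ranks)
        if r_old != r_new:
            moved.append((i, r_old, r_new))
    return moved
```

Because the decomposition is static, the slab boundaries never change during a run; only particle ownership does, which keeps the bookkeeping simple at the cost of possible load imbalance.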
Static Analysis Techniques for Fixing Software Defects in MPI-Based Parallel Programs
2
Authors: Norah Abdullah Al-Johany, Sanaa Abdullah Sharaf, Fathy Elbouraey Eassa, Reem Abdulaziz Alnanih. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 3139-3173, 35 pages
The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed memory systems. However, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model achieved a detection rate of 37 out of 40 codes, resulting in an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
Keywords: High-performance computing parallel computing software engineering software defect message passing interface DEADLOCK
Development of Ubiquitous Simulation Service Structure Based on High Performance Computing Technologies (cited by: 2)
3
Authors: Sang-Hyun CHO, Jeong-Kil CHOI. Journal of Materials Science & Technology (SCIE, EI, CAS, CSCD), 2008, No. 3, pp. 374-378, 5 pages
The simulation field became essential in designing or developing new casting products and in improving manufacturing processes within limited time, because it can help us to simulate the nature of processing, so that developers can make ideal casting designs. To take the prior occupation at commercial simulation market, so many development groups in the world are doing their every effort. They already reported successful stories in manufacturing fields by developing and providing the high performance simulation technologies for multipurpose. But they all run at powerful desk-side computers by well-trained experts mainly, so that it is hard to diffuse the scientific designing concept to newcomers in casting field. To overcome upcoming problems in scientific casting designs, we utilized information technologies and full-matured hardware backbones to spread out the effective and scientific casting design mind, and they all were integrated into Simulation Portal on the web. It professes scientific casting design on the NET including ubiquitous access way represented by "Anyone, Anytime, Anywhere" concept for casting designs.
Keywords: Parallel computation message passing interface (MPI) Shared memory processing (SMP) CLUSTERING UBIQUITOUS
Parallel computation of unified finite-difference time-domain for underwater sound scattering (cited by: 2)
4
Authors: 冯玉田, 王朔中. Journal of Shanghai University (English Edition) (CAS), 2008, No. 2, pp. 120-125, 6 pages
In this work, we treat scattering objects, water, surface and bottom in a truly unified manner in a parallel finite-difference time-domain (FDTD) scheme, which is suitable for distributed parallel computing in a message passing interface (MPI) programming environment. The algorithm is implemented on a cluster-based high performance computer system. Parallel computation is performed with different division methods in 2D and 3D situations. Based on analysis of main factors affecting the speedup rate and parallel efficiency, data communication is reduced by selecting a suitable scheme of task division. A desirable scheme is recommended, giving a higher speedup rate and better efficiency. The results indicate that the unified parallel FDTD algorithm provides a solution to the numerical computation of acoustic scattering.
Keywords: parallel computation finite-difference time-domain (FDTD) message passing interface (MPI) object scattering
Coupling analysis of transmission lines excited by space electromagnetic fields based on time domain hybrid method using parallel technique (cited by: 1)
5
Authors: Zhi-Hong Ye, Xiao-Lin Wu, Yao-Yao Li. Chinese Physics B (SCIE, EI, CAS, CSCD), 2020, No. 9, pp. 249-254, 6 pages
We present a time domain hybrid method to realize the fast coupling analysis of transmission lines excited by space electromagnetic fields, in which parallel finite-difference time-domain (FDTD) method, interpolation scheme, and Agrawal model-based transmission line (TL) equations are organically integrated together. Specifically, the Agrawal model is employed to establish the TL equations to describe the coupling effects of space electromagnetic fields on transmission lines. Then, the excitation fields functioning as distribution sources in TL equations are calculated by the parallel FDTD method through using the message passing interface (MPI) library scheme and interpolation scheme. Finally, the TL equations are discretized by the central difference scheme of FDTD and assigned to multiple processors to obtain the transient responses on the terminal loads of these lines. The significant feature of the presented method is embodied in its parallel and synchronous calculations of the space electromagnetic fields and transient responses on the lines. Numerical simulations of ambient wave acting on multi-conductor transmission lines (MTLs), which are located on the PEC ground and in the shielded cavity respectively, are implemented to verify the accuracy and efficiency of the presented method.
Keywords: Agrawal model transmission line equations parallel FDTD method message passing interface (MPI) library
Multi-Deme Parallel FGAs-Based Algorithm for Multitarget Tracking (cited by: 1)
6
Authors: 刘虎, 朱力立, 张焕春. Journal of Electronic Science and Technology of China, 2006, No. 1, pp. 12-17, 6 pages
For data association in multisensor and multitarget tracking, a novel parallel algorithm is developed to improve the efficiency and real-time performance of FGAs-based algorithm. One Cluster of Workstation (COW) with Message Passing Interface (MPI) is built. The proposed Multi-Deme Parallel FGA (MDPFGA) is run on the platform. A series of special MDPFGAs are used to determine the static and the dynamic solutions of generalized m-best S-D assignment problem respectively, as well as target states estimation in track management. Such an assignment-based parallel algorithm is demonstrated on simulated passive sensor track formation and maintenance problem. While illustrating the feasibility of the proposed algorithm in multisensor multitarget tracking, simulation results indicate that the MDPFGAs-based algorithm has greater efficiency and speed than the FGAs-based algorithm.
Keywords: multitarget tracking multi-deme Fuzzy Genetic Algorithm (FGA) PARALLELIZATION message passing interface (MPI)
Development of high performance casting analysis software by coupled parallel computation
7
Authors: Sang Hyun CHO, Jeong Kil CHOI. China Foundry (SCIE, CAS), 2007, No. 3, pp. 215-219, 5 pages
Up to now, so much casting analysis software has been continuing to develop the new access way to real casting processes. Those include the melt flow analysis, heat transfer analysis for solidification calculation, mechanical property predictions and microstructure predictions. These trials were successful to obtain the ideal results comparing with real situations, so that CAE technologies became inevitable to design or develop new casting processes. But for manufacturing fields, CAE technologies are not so frequently being used because of their difficulties in using the software or insufficient computing performances. To introduce CAE technologies to manufacturing field, the high performance analysis is essential to shorten the gap between product designing time and prototyping time. The software code optimization can be helpful, but it is not enough, because the codes developed by software experts are already optimized enough. As an alternative proposal for high performance computations, the parallel computation technologies are eagerly being applied to CAE technologies to make the analysis time shorter. In this research, SMP (Shared Memory Processing) and MPI (Message Passing Interface) (1) methods for parallelization were applied to commercial software "Z-Cast" to calculate the casting processes. In the code parallelizing processes, the network stabilization, core optimization were also carried out under Microsoft Windows platform and their performances and results were compared with those of normal linear analysis codes.
Keywords: parallel computation message passing interface casting analysis SMP performance improvement
Large-scale high performance computation on 3D explosion and shock problems
8
Authors: 费广磊, 马天宝, 郝莉. Applied Mathematics and Mechanics (English Edition) (SCIE, EI), 2011, No. 3, pp. 375-382, 8 pages
Explosion and shock often involve large deformation, interface treatment between multi-material, and strong discontinuity. The Eulerian method has advantages for solving these problems. In parallel computation of the Eulerian method, the physical quantities of the computational cells do not change before the disturbance reaches these cells. Computational efficiency is low when using fixed partition because of load imbalance. To solve this problem, a dynamic parallel method in which the computation domain expands with disturbance is used. The dynamic parallel program is designed based on the generally used message passing interface model. The numerical test of dynamic parallel program agrees well with that of the original parallel program, also agrees with the actual situation.
Keywords: explosion explosion and shock dynamic parallel message passing interface AIR
Real-space parallel density matrix renormalization group with adaptive boundaries
9
Authors: Fu-Zhou Chen, Chen Cheng, Hong-Gang Luo. Chinese Physics B (SCIE, EI, CAS, CSCD), 2021, No. 8, pp. 191-197, 7 pages
We propose an improved real-space parallel strategy for the density matrix renormalization group (DMRG) method, where boundaries of separate regions are adaptively distributed during DMRG sweeps. Our scheme greatly improves the parallel efficiency with shorter waiting time between two adjacent tasks, compared with the original real-space parallel DMRG with fixed boundaries. We implement our new strategy based on the message passing interface (MPI), and dynamically control the number of kept states according to the truncation error in each DMRG step. We study the performance of the new parallel strategy by calculating the ground state of a spin-cluster chain and a quantum chemical Hamiltonian of the water molecule. The maximum parallel efficiencies for these two models are 91% and 76% in 4 nodes, which are much higher than the real-space parallel DMRG with fixed boundaries.
Keywords: density matrix renormalization group strongly correlated systems message passing interface
An efficient parallel algorithm for ocean circulation numerical model based on irregular rectangle decomposition scheme
10
Authors: ZHUANG Zhanpeng, YUAN Yeli, ZHANG Jie, HAN Lei, YANG Jungang. Acta Oceanologica Sinica (SCIE, CAS, CSCD), 2016, No. 5, pp. 18-23, 6 pages
A parallel algorithm of circulation numerical model based on message passing interface (MPI) is developed using serialization and an irregular rectangle decomposition scheme. Neighboring point exchange strategy (NPES) is adopted to further enhance the computational efficiency. Two experiments are conducted on HP C7000 Blade System; the numerical results show that the parallel version with NPES (PVN) produces higher efficiency than the original parallel version (PV). The PVN achieves parallel efficiency in excess of 0.9 in the second experiment when the number of processors increases to 100, while the efficiency of PV decreases to 0.39 rapidly. The PVN of ocean circulation model is used in a fine-resolution regional simulation, which produces better results. The capability of universal implementation of this algorithm makes it applicable in many other ocean models potentially.
Keywords: irregular rectangle decomposition scheme message passing interface (MPI) neighboring point exchange strategy data communication
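To put the efficiency figures quoted in the abstract above in context, the sketch below uses the conventional definitions of speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by processor count). Whether the paper uses exactly these definitions and a single-processor baseline is an assumption; the abstract does not say. Under that assumption, an efficiency of 0.9 on 100 processors corresponds to a speedup of about 90, and 0.39 to about 39. The function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Conventional speedup: single-processor time over parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Conventional parallel efficiency: speedup per processor."""
    return speedup(t_serial, t_parallel) / n_procs


def implied_speedup(eff, n_procs):
    """Invert the efficiency definition to recover the speedup it implies."""
    return eff * n_procs


# Figures taken from the abstract; the interpretation is an assumption.
pvn_speedup_lower_bound = implied_speedup(0.9, 100)
pv_speedup = implied_speedup(0.39, 100)
```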
High-Performance Flow Classification of Big Data Using Hybrid CPU-GPU Clusters of Cloud Environments
11
Authors: Azam Fazel-Najafabadi, Mahdi Abbasi, Hani H. Attar, Ayman Amer, Amir Taherkordi, Azad Shokrollahi, Mohammad R. Khosravi, Ahmed A. Solyman. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, No. 4, pp. 1118-1137, 20 pages
The network switches in the data plane of Software Defined Networking (SDN) are empowered by an elementary process, in which enormous number of packets which resemble big volumes of data are classified into specific flows by matching them against a set of dynamic rules. This basic process accelerates the processing of data, so that instead of processing singular packets repeatedly, corresponding actions are performed on corresponding flows of packets. In this paper, first, we address limitations on a typical packet classification algorithm like Tuple Space Search (TSS). Then, we present a set of different scenarios to parallelize it on different parallel processing platforms, including Graphics Processing Units (GPUs), clusters of Central Processing Units (CPUs), and hybrid clusters. Experimental results show that the hybrid cluster provides the best platform for parallelizing packet classification algorithms, which promises the average throughput rate of 4.2 Million packets per second (Mpps). That is, the hybrid cluster produced by the integration of Compute Unified Device Architecture (CUDA), Message Passing Interface (MPI), and OpenMP programming model could classify 0.24 million packets per second more than the GPU cluster scheme. Such a packet classifier satisfies the required processing speed in the programmable network systems that would be used to communicate big medical data.
Keywords: OPENMP Compute Unified Device Architecture (CUDA) message passing interface (MPI) packet classification medical data tuple space algorithm Graphics Processing Unit (GPU) cluster
High Performance MPI over the Slingshot Interconnect
12
Authors: Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 1, pp. 128-145, 18 pages
The Slingshot interconnect designed by HPE/Cray is becoming more relevant in high-performance computing with its deployment on the upcoming exascale systems. In particular, it is the interconnect empowering the first exascale and highest-ranked supercomputer in the world, Frontier. It offers various features such as adaptive routing, congestion control, and isolated workloads. The deployment of newer interconnects sparks interest related to performance, scalability, and any potential bottlenecks as they are critical elements contributing to the scalability across nodes on these systems. In this paper, we delve into the challenges the Slingshot interconnect poses with current state-of-the-art MPI (message passing interface) libraries. In particular, we look at the scalability performance when using Slingshot across nodes. We present a comprehensive evaluation using various MPI and communication libraries including Cray MPICH, Open-MPI+UCX, RCCL, and MVAPICH2 on CPUs and GPUs on the Spock system, an early access cluster deployed with Slingshot-10, AMD MI100 GPUs and AMD Epyc Rome CPUs to emulate the Frontier system. We also evaluate preliminary CPU-based support of MPI libraries on the Slingshot-11 interconnect.
Keywords: AMD GPU interconnect technology MPI (message passing interface) Slingshot
An MPI+OpenACC-Based PRM Scalar Advection Scheme in the GRAPES Model over a Cluster with Multiple CPUs and GPUs (cited by: 2)
13
Authors: Huadong Xiao, Yang Lu, Jianqiang Huang, Wei Xue. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, No. 1, pp. 164-173, 10 pages
A moisture advection scheme is an essential module of a numerical weather/climate model representing the horizontal transport of water vapor. The Piecewise Rational Method (PRM) scalar advection scheme in the Global/Regional Assimilation and Prediction System (GRAPES) solves the moisture flux advection equation based on PRM. Computation of the scalar advection involves boundary exchange, and computation of higher bandwidth requirements is complicated and time-consuming in GRAPES. Recently, Graphics Processing Units (GPUs) have been widely used to solve scientific and engineering computing problems owing to advancements in GPU hardware and related programming models such as CUDA/OpenCL and Open Accelerator (OpenACC). Herein, we present an accelerated PRM scalar advection scheme with Message Passing Interface (MPI) and OpenACC to fully exploit GPUs' power over a cluster with multiple Central Processing Units (CPUs) and GPUs, together with optimization of various parameters such as minimizing data transfer, memory coalescing, exposing more parallelism, and overlapping computation with data transfers. Results show that about 3.5 times speedup is obtained for the entire model running at medium resolution with double precision when comparing the scheme's elapsed time on a node with two GPUs (NVIDIA P100) and two 16-core CPUs (Intel Gold 6142). Further, results obtained from experiments of a higher resolution model with multiple GPUs show excellent scalability.
Keywords: Graphics Processing Unit (GPU) computing Open Accelerator (OpenACC) message passing interface (MPI) Global/Regional Assimilation and Prediction System (GRAPES) Piecewise Rational Method (PRM) scalar advection scheme
GPU acceleration of a nonhydrostatic model for the internal solitary waves simulation (cited by: 1)
14
Authors: 陈同庆, 张庆河. Journal of Hydrodynamics (SCIE, EI, CSCD), 2013, No. 3, pp. 362-369, 8 pages
The parallel computing algorithm for a nonhydrostatic model on one or multiple Graphic Processing Units (GPUs) for the simulation of internal solitary waves is presented and discussed. The computational efficiency of the GPU scheme is analyzed by a series of numerical experiments, including an ideal case and the field scale simulations, performed on the workstation and the supercomputer system. The calculated results show that the speedup of the developed GPU-based parallel computing scheme, compared to the implementation on a single CPU core, increases with the number of computational grid cells, and the speedup can increase quasi-linearly with respect to the number of involved GPUs for the problem with relatively large number of grid cells within 32 GPUs.
Keywords: Graphic Processing Unit (GPU) internal solitary wave nonhydrostatic model SPEEDUP message passing interface (MPI)
PMODTRAN: a parallel implementation based on MODTRAN for massive remote sensing data processing
15
Authors: Fang Huang, Ji Zhou, Jian Tao, Xicheng Tan, Shunlin Liang, Jie Cheng. International Journal of Digital Earth (SCIE, EI, CSCD), 2016, No. 9, pp. 819-834, 16 pages
MODerate resolution atmospheric TRANsmission (MODTRAN) is a commercial remote sensing (RS) software package that has been widely used to simulate radiative transfer of electromagnetic radiation through the Earth's atmosphere and the radiation observed by a remote sensor. However, when very large RS datasets must be processed in simulation applications at a global scale, it is extremely time-consuming to operate MODTRAN on a modern workstation. Under this circumstance, the use of parallel cluster computing to speed up the process becomes vital to this time-consuming task. This paper presents PMODTRAN, an implementation of a parallel task-scheduling algorithm based on MODTRAN. PMODTRAN was able to reduce the processing time of the test cases used here from over 4.4 months on a workstation to less than a week on a local computer cluster. In addition, PMODTRAN can distribute tasks with different levels of granularity and has some extra features, such as dynamic load balancing and parameter checking.
Keywords: Parallel computing message passing interface MODTRAN thermal infrared remote sensing land-surface temperature retrieval
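The abstract above mentions parallel task scheduling with dynamic load balancing but gives no algorithmic detail. As a generic illustration only, and not PMODTRAN's actual scheduler, the sketch below simulates a common master-worker pattern in which each task is handed to whichever worker becomes idle first. The function name `dynamic_schedule` and the per-task cost list are hypothetical.

```python
import heapq


def dynamic_schedule(task_costs, n_workers):
    """Simulate giving each task, in order, to the worker that becomes idle first.

    Returns the per-worker task lists and the simulated finish time (makespan).
    """
    # Heap of (time_when_free, worker_id); every worker is idle at t = 0.
    free_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(n_workers)]
    for task_id, cost in enumerate(task_costs):
        t_free, worker = heapq.heappop(free_at)
        assignment[worker].append(task_id)
        heapq.heappush(free_at, (t_free + cost, worker))
    makespan = max(t for t, _ in free_at)
    return assignment, makespan
```

With costs `[4, 1, 1, 1, 1]` on two workers, the long task occupies one worker while the other absorbs the four short ones, so neither sits idle; a fixed half-and-half split of the task list would finish later.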
Fast Multicast on Multistage Interconnection Networks Using Multi-Head Worms
16
Authors: 王晓东, 徐明, 周兴铭. Journal of Computer Science & Technology (SCIE, EI, CSCD), 1999, No. 3, pp. 250-258, 9 pages
This paper proposes a new approach for implementing fast multicast on multistage interconnection networks (MINs) with multi-head worms. For an MIN with n stages of k×k switches, a single multi-head worm can cover an arbitrary set of destinations with a single communication start-up. Compared with schemes using unicast messages, this approach reduces multicast latency significantly and performs better than multi-destination worms.
Keywords: MULTICAST message passing interface (MPI) multi-head worm multistage interconnection networks (MINs) wormhole routing
A new method to retrieve aerosol optical thickness from satellite images on a parallel system
17
Authors: Jianping Guo, Huadong Xiao, Yong Xue, Huizheng Che, Xiaoye Zhang, Chunxiang Cao, Jie Guang, Hao Zhang. Particuology (SCIE, EI, CAS, CSCD), 2009, No. 5, pp. 392-398, 7 pages
A wide variety of algorithms have been developed to monitor aerosol burden from satellite images. Still, few solutions currently allow for real-time and efficient retrieval of aerosol optical thickness (AOT), mainly due to the extremely large volume of computation necessary for the numeric solution of atmospheric radiative transfer equations. Taking into account the efforts to exploit the SYNergy of Terra and Aqua Modis (SYNTAM, an AOT retrieval algorithm), we present in this paper a novel method to retrieve AOT from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images, in which the strategy of block partition and collective communication was taken, thereby maximizing load balance and reducing the overhead time during inter-processor communication. Experiments were carried out to retrieve AOT at 0.44, 0.55, and 0.67 μm of MODIS/Terra and MODIS/Aqua data, using the parallel SYNTAM algorithm in the IBM System Cluster 1600 deployed at China Meteorological Administration (CMA). Results showed that parallel implementation can greatly reduce computation time, and thus ensure high parallel efficiency. AOT derived by parallel algorithm was validated against measurements from ground-based sun-photometers; in all cases, the relative error range was within 20%, which demonstrated that the parallel algorithm was suitable for applications such as air quality monitoring and climate modeling.
Keywords: AOT Parallel computation Block partitioning message passing interface (MPI)
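The block partition strategy named in the abstract above can be pictured with a minimal sketch, assuming the simplest variant: image rows are divided into contiguous blocks whose sizes differ by at most one, so each process receives a nearly equal share. The paper's actual partitioning rule is not given in the abstract, and `block_ranges` is a hypothetical name.

```python
def block_ranges(n_items, n_procs):
    """Split range(n_items) into n_procs contiguous half-open blocks of near-equal size."""
    base, extra = divmod(n_items, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional item each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

In an MPI program, block sizes computed this way are the kind of per-rank counts one would pass to a variable-count collective such as `MPI_Scatterv` to distribute the rows.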