This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models, pivotal for advancing high-performance computing (HPC). Emphasizing the transition of GPUs from graphic-centric processors to versatile computing units, it delves into the nuanced optimization of memory access, thread management, algorithmic design, and data structures. These optimizations are critical for exploiting the parallel processing capabilities of GPUs, addressing both the theoretical frameworks and practical implementations. By integrating advanced strategies such as memory coalescing, dynamic scheduling, and parallel algorithmic transformations, this research aims to significantly elevate computational efficiency and throughput. The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains, highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments. The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers, fostering advancements in computational sciences and technology.
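Of the strategies named above, memory coalescing is the most concrete to illustrate. The following is a minimal CUDA sketch, not taken from the paper, contrasting a strided global-memory access pattern with a coalesced one; the kernels, names and sizes are illustrative only.

    // Both kernels copy an n x n row-major matrix; they differ only in which
    // index varies with threadIdx.x, i.e., in how consecutive threads of a
    // warp touch global memory.
    __global__ void copy_strided(const float* in, float* out, int n) {
        int row = blockIdx.x * blockDim.x + threadIdx.x;  // thread id walks rows
        int col = blockIdx.y;
        if (row < n && col < n)
            out[row * n + col] = in[row * n + col];       // neighbouring threads are n floats apart
    }

    __global__ void copy_coalesced(const float* in, float* out, int n) {
        int col = blockIdx.x * blockDim.x + threadIdx.x;  // thread id walks columns
        int row = blockIdx.y;
        if (row < n && col < n)
            out[row * n + col] = in[row * n + col];       // neighbouring threads touch adjacent floats
    }

    // Illustrative launch for n = 4096 with 256-thread blocks:
    //   dim3 grid((4096 + 255) / 256, 4096);
    //   copy_coalesced<<<grid, 256>>>(d_in, d_out, 4096);

In the coalesced kernel, a warp's 32 loads fall in a few contiguous cache lines and are combined into a small number of memory transactions, whereas the strided kernel issues nearly one transaction per thread; that difference alone typically changes effective bandwidth by a large factor, which is why coalescing is usually the first memory-access optimization applied.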
Cloud computing is expanding widely in the world of IT infrastructure. This is due partly to the cost-saving effect of economies of scale. Fair market conditions can in theory provide a healthy environment to reflect the most reasonable costs of computations. While fixed cloud pricing provides an attractive low entry barrier for compute-intensive applications, both the consumer and supplier of computing resources can see high efficiency for their investments by participating in auction-based exchanges. There are huge incentives for the cloud provider to offer auctioned resources; from the consumer perspective, however, using these resources is a sparsely discussed challenge. This paper reports a methodology and framework designed to address the challenges of running HPC (High Performance Computing) applications on auction-based cloud clusters. The authors focus on HPC applications and describe a method for determining bid-aware checkpointing intervals. They extend a theoretical model for determining checkpoint intervals using statistical analysis of pricing histories. The latest developments in the SpotHPC framework, which aims at facilitating the managed execution of real MPI applications in auction-based cloud environments, are also introduced. The authors use their model to simulate a set of algorithms with different computing and communication densities. The results show the complex interactions between optimal bidding strategies and parallel application performance.
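The abstract does not reproduce the bid-aware model itself. As a reference point only, a classic baseline for choosing checkpoint intervals is Young's approximation T ≈ sqrt(2·C·M), where C is the cost of writing one checkpoint and M is the mean time between interruptions; in an auction setting, M can be estimated from the pricing history as the mean time between out-bid events. A small host-side sketch under these assumptions (all names illustrative, not from SpotHPC):

    #include <cmath>
    #include <vector>

    // Estimate the mean time between out-of-bid interruptions from a price
    // history sampled at a fixed interval (seconds): every upward crossing of
    // the bid counts as one revocation event.
    double mean_time_between_interruptions(const std::vector<double>& prices,
                                           double bid, double sample_interval_s) {
        int interruptions = 0;
        for (size_t i = 1; i < prices.size(); ++i)
            if (prices[i] > bid && prices[i - 1] <= bid)
                ++interruptions;
        double horizon_s = sample_interval_s * static_cast<double>(prices.size());
        return interruptions > 0 ? horizon_s / interruptions : horizon_s;
    }

    // Young's first-order approximation of the checkpoint interval (seconds).
    double checkpoint_interval(double checkpoint_cost_s, double mtbi_s) {
        return std::sqrt(2.0 * checkpoint_cost_s * mtbi_s);
    }

For example, with a 60 s checkpoint cost and interruptions every six hours on average, the formula gives an interval of about 1600 s, roughly 27 minutes; a bid-aware model such as the one described in the paper would refine M, and hence the interval, as the price history and the chosen bid change.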
The discrete element method (DEM) can effectively simulate the discontinuity, inhomogeneity, and large deformation and failure of rock and soil. Based on an innovative matrix formulation of the discrete element method, the high-performance discrete element software MatDEM can handle millions of elements on one computer and enables discrete element simulation at the engineering scale. It supports heat calculation, multi-field and fluid-solid coupling numerical simulations. Furthermore, the software integrates pre-processing, a solver, post-processing, and powerful secondary development, allowing users to develop and recompile new discrete element software. The basic principles of the DEM, the implementation and development of the MatDEM software, and its applications are introduced in this paper. The software and sample source code are available online (http://matdem.com).
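The abstract does not spell out the matrix formulation; as background only, a common form of the basic DEM contact law — a linear elastic normal force proportional to the overlap of two spherical elements — can be sketched as follows (illustrative names, not MatDEM code):

    #include <cmath>

    struct Element { double x, y, z, r; };  // centre coordinates and radius

    // Linear elastic normal contact: when two elements overlap, the repulsive
    // force is proportional to the overlap depth, F = Kn * delta; zero otherwise.
    double normal_force(const Element& a, const Element& b, double Kn) {
        double dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
        double dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        double overlap = a.r + b.r - dist;
        return overlap > 0.0 ? Kn * overlap : 0.0;
    }

A matrix-based implementation evaluates a law of this kind for all contacts at once through array (matrix) operations rather than an element-by-element loop, which is how such codes can reach millions of elements on a single computer.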
Factors that affect concrete creep include mixture composition, curing conditions, ambient exposure conditions, and element geometry. To account for the influence of the concrete mixture and to improve the prediction of prestress loss in important structures, an experimental test under laboratory conditions was carried out to investigate the compression creep of two high-performance concrete mixtures used for prestressed members in one bridge. Based on the experimental results, a power exponent function was used to model the creep degree of the two HPCs for structural numerical analysis, and two series of parameters of this function for the two HPCs were calculated with an evolution program optimization method. The experimental data were compared with the CEB-FIP 90 and ACI 209(92) models, and the two code models both overestimated the creep degrees of the two HPCs. It is therefore recommended that the power exponent function be used in the structural analysis of this bridge.
In order to investigate the compression creep of two kinds of high-performance concrete mixtures used for prestressed members in a bridge, an experimental test under laboratory conditions was carried out. Based on the experimental results, a power exponent function was used to model the creep degree of these high-performance concretes (HPCs) for structural numerical analysis, and two series of parameters of this function for the HPCs were obtained with the optimum method of an evolution program. The experimental data were compared with the CEB-FIP 90 and ACI 209(92) models. The results show that the two code models both overestimate the creep degree of the two HPCs, so it is recommended that the power exponent function be used for the creep analysis of the bridge structure.
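Neither abstract gives the exact power-exponent expression, so only the fitting problem is sketched here: writing the modelled creep degree as C(t; θ) with parameter vector θ, the evolution program searches for the parameter set that minimizes the misfit to the measured creep degrees C_exp(t_i) at the N measurement times,

    \theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} \bigl( C_{\mathrm{exp}}(t_i) - C(t_i;\theta) \bigr)^{2}

A population-based evolution program is a reasonable choice for this kind of fit because the objective is cheap to evaluate, requires no gradients, and may have several local minima; the two series of parameters reported above correspond to one such fit per concrete mixture.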
The meteorological high-performance computing resource is the support platform for the operation of weather forecast and climate prediction numerical models. A scientific and objective method to evaluate the application of meteorological high-performance computing resources can not only provide a reference for the optimization of active resources, but also provide a quantitative basis for future resource construction and planning. In this paper, the concepts of the utility value B and the index compliance rate E of the meteorological high-performance computing system are presented, and the evaluation process, evaluation indices and calculation method for the application benefits of high-performance computing resources are introduced.
Within the last few decades, increases in computational resources have contributed enormously to the progress of science and engineering (S & E). To continue making rapid advancements, the S & E community must be able to access computing resources. One way to provide such resources is through High-Performance Computing (HPC) centers. Many academic research institutions offer their own HPC centers but struggle to make the computing resources easily accessible and user-friendly. Here we present SHABU, a RESTful Web API framework that enables S & E communities to access resources from Boston University’s Shared Computing Center (SCC). The SHABU requirements are derived from the use cases described in this work.
In the last two decades, computational hydraulics has undergone rapid development following the advancement of data acquisition and computing technologies. Using a finite-volume Godunov-type hydrodynamic model, this work demonstrates the promise of modern high-performance computing technology for achieving real-time flood modeling at a regional scale. The software is implemented for high-performance heterogeneous computing using the OpenCL programming framework, and developed to support simulations across multiple GPUs using a domain decomposition technique and across multiple systems through an efficient implementation of the Message Passing Interface (MPI) standard. The software is applied to a flood event induced by a convective storm in Newcastle upon Tyne, demonstrating high computational performance across a GPU cluster and good agreement with crowd-sourced observations. Issues relating to data availability, complex urban topography and differences in drainage capacity affect the results for a small number of areas.
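The abstract names the two parallelization layers — OpenCL within a node and MPI across nodes via domain decomposition — without showing them. As a generic illustration of the MPI layer only (not the paper's code), a one-dimensional domain decomposition exchanges one halo row with each neighbouring subdomain every time step:

    #include <mpi.h>
    #include <vector>

    // Exchange one halo row of 'width' cells with the ranks above and below in
    // a 1-D domain decomposition. 'field' holds (rows + 2) * width cells:
    // row 0 and row rows+1 are halo rows, rows 1..rows are owned by this rank.
    void exchange_halos(std::vector<double>& field, int rows, int width,
                        MPI_Comm comm) {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        int up   = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int down = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        // Send my first owned row up, receive my lower halo row from below.
        MPI_Sendrecv(&field[1 * width],          width, MPI_DOUBLE, up,   0,
                     &field[(rows + 1) * width], width, MPI_DOUBLE, down, 0,
                     comm, MPI_STATUS_IGNORE);
        // Send my last owned row down, receive my upper halo row from above.
        MPI_Sendrecv(&field[rows * width],       width, MPI_DOUBLE, down, 1,
                     &field[0],                  width, MPI_DOUBLE, up,   1,
                     comm, MPI_STATUS_IGNORE);
    }

In a GPU-accelerated solver the halo rows would additionally be copied between device and host around this call, or exchanged directly with a GPU-aware MPI implementation.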
The study of global climate change seeks to understand: (1) the components of the Earth’s varying environmental system, with a particular focus on climate; (2) how these components interact to determine present conditions; (3) the factors driving these components; (4) the history of global change and the projection of future change; and (5) how knowledge about global environmental variability and change can be applied to present-day and future decision-making. This paper addresses the use of high-performance computing (HPC) and high-throughput computing (HTC) for global change studies on the Digital Earth (DE) platform. Two aspects of the use of HPC/HTC on the DE platform are the processing of data from all sources, especially Earth observation data, and the simulation of global change models. HPC/HTC is an essential and efficient tool for processing vast amounts of global data, especially Earth observation data. A current trend involves running complex global climate models on potentially millions of personal computers to achieve better climate change predictions than would ever be possible using the supercomputers currently available to scientists.
Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within compute nodes. Including these new storage technologies in scientific workflows remains, unfortunately, a mostly manual task today, and most scientists therefore do not take advantage of the faster storage media. One approach to systematically include node-local SSDs or NVRAM in scientific workflows is to deploy ad hoc file systems over a set of compute nodes, which serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 "Challenges and Opportunities of User-Level File Systems for HPC" and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule stage-in and stage-out processes of data between the storage backend and the ad hoc file systems. Also presented are strategies for building ad hoc file systems from reusable components for networking and for improving storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. Their presentation covers a range from file systems running in production to cutting-edge research focused on reaching the performance limits of the underlying devices.
Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its abundant computing and storage resources. However, the geographical distribution of computing and storage resources makes efficient task distribution and data placement more challenging. To achieve higher system performance, this study proposes a two-level global collaborative scheduling strategy for wide-area high-performance computing environments. The collaborative scheduling strategy integrates lightweight solution selection, redundant data placement and task stealing mechanisms, optimizing task distribution and data placement to achieve efficient computing in wide-area environments. The experimental results indicate that, compared with the state-of-the-art collaborative scheduling algorithm HPS+, the proposed scheduling strategy reduces the makespan by 23.24%, improves computing and storage resource utilization by 8.28% and 21.73% respectively, and achieves similar global data migration costs.
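Of the three mechanisms the strategy integrates, task stealing is the simplest to illustrate in isolation. The following sequential sketch shows only the stealing policy — an idle site takes pending tasks from the back of the most heavily loaded site's queue — and is not the scheduling strategy from the paper:

    #include <deque>
    #include <vector>

    struct Task { int id; };

    // Per-site pending-task queues. A site that runs out of local work steals
    // from the back of the most loaded queue, leaving the victim's front tasks
    // (the ones it will execute next) untouched.
    bool steal_task(std::vector<std::deque<Task>>& queues, int thief, Task& out) {
        int victim = -1;
        size_t best = 1;  // only steal from sites with at least 2 pending tasks
        for (size_t s = 0; s < queues.size(); ++s)
            if ((int)s != thief && queues[s].size() > best) {
                best = queues[s].size();
                victim = (int)s;
            }
        if (victim < 0) return false;   // nothing worth stealing
        out = queues[victim].back();    // take from the tail of the victim queue
        queues[victim].pop_back();
        queues[thief].push_back(out);
        return true;
    }

In a wide-area setting a practical stealer would also weigh data locality, since migrating a task may require migrating its input data, which is presumably why the strategy pairs task stealing with redundant data placement.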
Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables the composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
基金"This paper is an extended version of "SpotMPl: a framework for auction-based HPC computing using amazon spot instances" published in the International Symposium on Advances of Distributed Computing and Networking (ADCN 2011).Acknowledgment This research is supported in part by the National Science Foundation grant CNS 0958854 and educational resource grants from Amazon.com.