基金"This paper is an extended version of "SpotMPl: a framework for auction-based HPC computing using amazon spot instances" published in the International Symposium on Advances of Distributed Computing and Networking (ADCN 2011).Acknowledgment This research is supported in part by the National Science Foundation grant CNS 0958854 and educational resource grants from Amazon.com.
Funding: Financial support from the Natural Science Foundation of China (41761134089, 41977218), the Six Talent Peaks Project of Jiangsu Province (RJFW-003), and the Fundamental Research Funds for the Central Universities (14380103) is gratefully acknowledged.
Abstract: The discrete element method (DEM) can effectively simulate the discontinuity, inhomogeneity, and large deformation and failure of rock and soil. Based on an innovative matrix computing approach to the discrete element method, the high-performance discrete element software MatDEM can handle millions of elements on one computer and enables discrete element simulation at the engineering scale. It supports heat calculation and multi-field and fluid-solid coupling numerical simulations. Furthermore, the software integrates pre-processing, a solver, post-processing, and powerful secondary development, allowing new discrete element software to be recompiled from it. This paper introduces the basic principles of the DEM, the implementation and development of the MatDEM software, and its applications. The software and sample source code are available online (http://matdem.com).
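MatDEM's matrix-based solver is not reproduced in the abstract; as a rough illustration of the underlying DEM idea, the sketch below advances a 2D disk assembly by one explicit time step using a vectorized linear elastic contact law. All names, the contact stiffness, and the O(n²) pairwise search are illustrative assumptions, not MatDEM code.

```python
# Hypothetical vectorized DEM time step (illustrative sketch, not MatDEM's code).
# Contact model: linear elastic normal force F_n = k_n * overlap between disks.
import numpy as np

def dem_step(pos, vel, radius, mass, kn=1e6, dt=1e-5):
    """Advance a 2D disk assembly by one explicit Euler step."""
    n = len(pos)
    diff = pos[:, None, :] - pos[None, :, :]            # (n, n, 2) separation vectors
    dist = np.linalg.norm(diff, axis=2) + np.eye(n)     # pad diagonal to avoid /0
    overlap = radius[:, None] + radius[None, :] - dist  # positive where disks touch
    np.fill_diagonal(overlap, 0.0)                      # a disk never contacts itself
    overlap = np.maximum(overlap, 0.0)                  # separated pairs exert no force
    normal = diff / dist[:, :, None]                    # unit vector from j toward i
    pair_f = (kn * overlap)[:, :, None] * normal        # repulsive force on i from j
    fnet = pair_f.sum(axis=1)                           # (n, 2) net force per disk
    vel = vel + fnet / mass[:, None] * dt
    pos = pos + vel * dt
    return pos, vel

# Two overlapping disks are pushed apart:
pos = np.array([[0.0, 0.0], [0.9, 0.0]])
pos, vel = dem_step(pos, np.zeros_like(pos),
                    radius=np.array([0.5, 0.5]), mass=np.array([1.0, 1.0]))
```

Production DEM codes (MatDEM included) replace the O(n²) pair loop with neighbor grids and express the force summation as sparse matrix operations, which is what makes million-element models tractable on one machine.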
Abstract: Factors that affect concrete creep include mixture composition, curing conditions, ambient exposure conditions, and element geometry. To account for the influence of the concrete mixture and to improve the prediction of prestress loss in important structures, a laboratory experiment was carried out to investigate the compression creep of two high-performance concrete (HPC) mixtures used for prestressed members in one bridge. Based on the experimental results, a power-exponent function of creep degree was used to model the creep degree of the two HPCs for structural numerical analysis, and two sets of parameters of this function were fitted with an evolutionary-programming optimization method. The experimental data were compared with the CEB-FIP 90 and ACI 209(92) models; both code models overestimated the creep degrees of the two HPCs. It is therefore recommended that the power-exponent function be used in the structural analysis of this bridge.
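The abstract does not reproduce the fitted function. One common power-exponent form for the creep degree, given here purely for illustration (this exact form is an assumption), is:

```latex
% Illustrative power-exponent creep-degree form; A, B, C are fitted parameters,
% t is the concrete age and \tau the age at loading (this exact form is an assumption).
C(t,\tau) = A\left[1 - e^{-B\,(t-\tau)^{C}}\right]
```

Parameter sets of this kind can be fitted to measured creep curves with an evolutionary optimizer, as the paper describes.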
Abstract: Meteorological high-performance computing resources are the support platform for running weather forecast and climate prediction numerical models. A scientific and objective method to evaluate the application of meteorological high-performance computing resources can not only provide a reference for optimizing the resources currently in use, but also provide a quantitative basis for future resource construction and planning. In this paper, the concepts of the utility value B and the index compliance rate E of a meteorological high-performance computing system are presented, and the evaluation process, evaluation indices, and calculation method for the application benefit of high-performance computing resources are introduced.
Abstract: Within the last few decades, increases in computational resources have contributed enormously to the progress of science and engineering (S&E). To continue making rapid advancements, the S&E community must be able to access computing resources. One way to provide such resources is through High-Performance Computing (HPC) centers. Many academic research institutions offer their own HPC centers but struggle to make the computing resources easily accessible and user-friendly. Here we present SHABU, a RESTful Web API framework that enables S&E communities to access resources from Boston University's Shared Computing Center (SCC). The SHABU requirements are derived from the use cases described in this work.
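SHABU's concrete endpoints are not listed in the abstract; the sketch below only shows the general shape of a RESTful resource query of the kind such a framework exposes. The base URL, endpoint path, token handling, and response fields are illustrative assumptions, not SHABU's documented API.

```python
# Hypothetical client call against a SHABU-like REST API; the host, path,
# and response fields below are illustrative assumptions.
import requests

BASE_URL = "https://scc.example.bu.edu/shabu/api/v1"  # placeholder, not the real host

def list_jobs(token: str, user: str):
    """Fetch the queued/running jobs for one cluster user (illustrative only)."""
    resp = requests.get(
        f"{BASE_URL}/jobs",
        params={"user": user},
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    resp.raise_for_status()   # surface HTTP errors instead of silently bad data
    return resp.json()        # e.g. [{"id": 123, "state": "RUNNING"}, ...]

if __name__ == "__main__":
    print(list_jobs(token="<api-token>", user="alice"))
```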
Funding: The Deanship of Scientific Research at King Abdulaziz University, Jeddah, Saudi Arabia, under Grant No. RG-12-611-43.
Abstract: The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed memory systems. However, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model achieved a detection rate of 37 out of 40 codes, an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
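For concreteness, the sketch below shows a classic BPTP deadlock of the kind DC_MPI targets, written with mpi4py rather than DC_MPI's own tooling (the buffer size and fix are illustrative assumptions).

```python
# Classic blocking point-to-point (BPTP) deadlock pattern, sketched with mpi4py.
# Run on two ranks:  mpirun -n 2 python deadlock_demo.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank
buf = np.zeros(1_000_000)  # large enough to force rendezvous, so Send blocks

# DEFECT (DL): both ranks block in Send, each waiting on a Recv that is never posted.
# comm.Send(buf, dest=peer); comm.Recv(buf, source=peer)

# FIX: a combined send/receive lets the MPI library pair the two transfers safely.
recv = np.empty_like(buf)
comm.Sendrecv(sendbuf=buf, dest=peer, recvbuf=recv, source=peer)
print(f"rank {rank}: exchange with rank {peer} completed")
```

Reordering the calls on one rank (Recv before Send) is an equally valid repair; the defective pattern may even appear to work for small messages that fit the eager protocol, which is what makes such defects hard to catch by testing alone.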
Abstract: Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation using standard sequential methods. This work analyzes the performance gains from parallel versus sequential hyperparameter optimization. Using scikit-learn's RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized search over the hyperparameter grid. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times compared to sequential tuning. However, test accuracy slightly dropped from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable timeframes, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
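A minimal sketch of the parallel tuning setup described above follows; the dataset and parameter ranges are illustrative assumptions, not the paper's exact configuration, but RandomizedSearchCV and n_jobs=-1 are the standard scikit-learn API it names.

```python
# Parallel randomized hyperparameter search with scikit-learn (illustrative data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

param_distributions = {          # ranges are assumptions for the sketch
    "n_estimators": [100, 200, 400],
    "max_depth": [None, 10, 20, 40],
    "min_samples_split": [2, 5, 10],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=20,
    cv=5,
    n_jobs=-1,   # -1 fans the 20 x 5 cross-validation fits out across all CPU cores
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Switching n_jobs between 1 and -1 on the same machine reproduces the sequential-versus-parallel comparison the paper reports.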
Funding: This work was supported in part by the MOST, China, under Grant Nos. 2009CB723906 and 2008AA12Z109, and by CAS under Grant No. KZCX2-YW-313.
Abstract: The study of global climate change seeks to understand: (1) the components of the Earth's varying environmental system, with a particular focus on climate; (2) how these components interact to determine present conditions; (3) the factors driving these components; (4) the history of global change and the projection of future change; and (5) how knowledge about global environmental variability and change can be applied to present-day and future decision-making. This paper addresses the use of high-performance computing (HPC) and high-throughput computing (HTC) for a global change study on the Digital Earth (DE) platform. Two aspects of the use of HPC/HTC on the DE platform are the processing of data from all sources, especially Earth observation data, and the simulation of global change models. HPC/HTC is an essential and efficient tool for processing vast amounts of global data, especially Earth observation data. The current trend involves running complex global climate models on potentially millions of personal computers to achieve better climate change predictions than would ever be possible using the supercomputers currently available to scientists.
Funding: This work is in part supported by the Director, Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357; in part by the Exascale Computing Project under Grant No. 17-SC-20-SC, a joint project of the U.S. Department of Energy's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative; and in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.
Abstract: Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
Funding: This work has been partially funded by the German Research Foundation (DFG) through the German Priority Programme 1648 "Software for Exascale Computing" (SPPEXA) and the ADA-FS project; by the European Union's Horizon 2020 Research and Innovation Program under the NEXTGenIO project, Grant No. 671591; by the Spanish Ministry of Science and Innovation under Contract No. TIN2015-65316; and by the Generalitat de Catalunya under Contract No. 2014-SGR-1051. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 (LLNL-JRNL-779789), and was also supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract No. DE-AC02-06CH11357. This work is also supported in part by the National Science Foundation of USA under Grant Nos. 1561041, 1564647, 1744336, 1763547, and 1822737.
Abstract: Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within compute nodes. Incorporating these new storage technologies into scientific workflows is unfortunately still a mostly manual task, and most scientists therefore do not take advantage of the faster storage media. One approach to systematically include node-local SSDs or NVRAM in scientific workflows is to deploy ad hoc file systems over a set of compute nodes, which serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 "Challenges and Opportunities of User-Level File Systems for HPC" and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule stage-in and stage-out processes of data between the storage backend and the ad hoc file systems. Also presented are strategies to build ad hoc file systems from reusable components for networking, and ways to improve storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. Their presentation covers a range from file systems running in production to cutting-edge research focused on reaching the performance limits of the underlying devices.
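The stage-in/stage-out pattern the seminar discusses has a simple schematic shape, sketched below. The mount points, file patterns, and job body are illustrative assumptions and not the API of BeeOND, GekkoFS, or BurstFS.

```python
# Schematic stage-in / compute / stage-out around an ad hoc file system.
# Paths and the solver invocation are illustrative assumptions.
import shutil
import subprocess
from pathlib import Path

BACKEND = Path("/parallel-fs/project/input")   # persistent parallel file system
RESULTS = Path("/parallel-fs/project/results")
ADHOC = Path("/mnt/adhoc-fs/run001")           # ad hoc FS over node-local SSD/NVRAM

def run_campaign():
    ADHOC.mkdir(parents=True, exist_ok=True)
    RESULTS.mkdir(parents=True, exist_ok=True)
    # Stage-in: copy hot input data from the backend to node-local storage.
    for src in BACKEND.glob("*.dat"):
        shutil.copy2(src, ADHOC / src.name)
    # Compute: the application reads and writes only the fast ad hoc mount.
    subprocess.run(["./solver", "--workdir", str(ADHOC)], check=True)
    # Stage-out: persist results before the ad hoc file system is torn down.
    for out in ADHOC.glob("*.out"):
        shutil.copy2(out, RESULTS / out.name)

if __name__ == "__main__":
    run_campaign()
```

One of the open questions the paper raises is exactly who runs these two copy phases: the application itself, as here, or the batch scheduler as separate stage-in and stage-out job steps.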
Funding: Deanship of Scientific Research (DSR), King Abdulaziz University, Grant/Award Number: D-139-137-1441.
Abstract: Owing to recent technological advances, molecular databases have grown exponentially, demanding faster and more efficient methods that can handle such huge amounts of data. Multi-processing CPU technology, including physical and logical processors (hyper-threading), can be used to significantly increase the performance of computations. Sequence comparison and pairwise alignment both contribute significantly to calculating the resemblance between sequences for constructing optimal alignments. This research used the Hash Table-NGram-Hirschberg (HT-NGH) algorithm to perform this pairwise alignment, exploiting its hashing capabilities. The authors propose using a parallel shared-memory architecture with hyper-threading to improve the performance of pairwise alignment on protein molecular datasets. The proposed parallel hyper-threading method decomposes the datasets at the sequence level for efficient utilization of the processing units, that is, reducing idle processing-unit situations. Combining hyper-threading with multicore processing on shared memory, the authors achieved an average speedup of 24.8%, with a highest boost rate of 34.4%. The improvement preserves acceptable accuracy, reaching speedups of 2.08, 2.88, and 3.87 and efficiencies of 1.04, 0.96, and 0.97 using 2, 3, and 4 cores, respectively.
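The reported efficiencies are consistent with the standard definitions of parallel speedup and efficiency:

```latex
% Speedup and efficiency on p cores: T_1 is serial time, T_p is parallel time.
S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p}
% Reported values: E_2 = 2.08/2 = 1.04, \quad E_3 = 2.88/3 = 0.96, \quad E_4 = 3.87/4 \approx 0.97
```

The superlinear value at 2 cores (E₂ > 1) is plausible when hyper-threading adds logical processors beyond the physical core count, which matches the paper's setup.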