Funding: Supported by the National Key R&D Program of China (No. 2017YFB0202104).
Abstract: Kinetic Monte Carlo (KMC) is a commonly used method for simulating radiation damage in materials. Our team has developed a parallel KMC package named Crystal-KMC, which supports the Embedded Atom Method (EAM) potential and uses the Message Passing Interface (MPI) to simulate vacancy transitions in copper (Cu) under neutron irradiation. To make better use of the computing power of modern supercomputers, we develop a parallel efficiency optimization model for Crystal-KMC on Tianhe-2, enabling larger simulations of the damage process in irradiated materials. First, we analyze the performance bottleneck of Crystal-KMC and use MIC offload statements to run the key modules of the software on the MIC coprocessor. We then use OpenMP to parallelize Crystal-KMC and combine it with the existing MPI inter-process communication optimization to achieve hybrid parallel optimization. The experimental results show that, in the single-node CPU and MIC collaborative parallel mode, the speedup of the computational hotspot reaches 30.1 and the speedup of the overall software reaches 7.43.
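For readers unfamiliar with KMC, the following is a minimal serial sketch of one step of the standard residence-time (BKL) algorithm with Arrhenius hop rates. It is not the Crystal-KMC implementation, which the abstract does not describe in detail: the function names `hop_rate` and `kmc_step`, the attempt frequency of 1e13 Hz, and the example barrier values are assumptions chosen for illustration, and no EAM evaluation, MPI, OpenMP, or MIC offload is shown.

```python
import bisect
import itertools
import math
import random

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def hop_rate(barrier_ev, temperature_k, attempt_hz=1.0e13):
    """Arrhenius rate for one vacancy hop (attempt_hz is an assumed prefactor)."""
    return attempt_hz * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))


def kmc_step(rates, rng):
    """Pick one event and a time increment (residence-time algorithm).

    Returns (event_index, dt). Event i is chosen with probability
    rates[i] / sum(rates); dt is exponentially distributed with mean
    1 / sum(rates).
    """
    cumulative = list(itertools.accumulate(rates))
    total = cumulative[-1] if cumulative else 0.0
    if total <= 0.0:
        raise ValueError("no event can occur: total rate is zero")
    target = rng.random() * total
    # First index whose cumulative rate exceeds the target; the min() guards
    # against a floating-point edge case where target rounds up to total.
    chosen = min(bisect.bisect_right(cumulative, target), len(rates) - 1)
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical migration barriers (eV) for three candidate hops at 600 K.
    rates = [hop_rate(e, 600.0) for e in (0.70, 0.75, 0.80)]
    event, dt = kmc_step(rates, rng)
    print(f"chosen hop: {event}, time advanced: {dt:.3e} s")
```

In a production code the expensive part is typically recomputing the rates after each hop, since every candidate barrier depends on the local atomic environment; a hotspot of that kind is the sort of work that offload and threading target.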
Funding: Supported by the National Natural Science Foundation of China (grant numbers 41675100 and 91337110); the Third Tibetan Plateau Scientific Experiment: Observations for Boundary Layer and Troposphere (GYHY201406001); the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (CAS) (QYZDY-SSW-DQC018); and the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the 2nd phase).
Funding: This work was partially supported by the National High Technology Research and Development 863 Program of China under Grant No. 2012AA01A301 and the National Natural Science Foundation of China under Grant No. 61120106005. Acknowledgements: The Tianhe-2 project is a great team effort and benefits from the cooperation of many individuals at NUDT. We would like to thank the entire Tianhe-2 development, applications, and benchmarking teams, and all the people who have contributed to the system in a variety of ways.
Abstract: In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, including remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, and user-level message passing services. Measured hardware performance results are also presented.
Funding: Supported by the National Key Research and Development Program of China (2017YFB0202200); the Advanced Research Project of China (31511010203); the Open Fund (201503-02) of the State Key Laboratory of High Performance Computing; and the Research Program of NUDT (ZK18-03-10).
Abstract: Exascale computing is one of the major challenges of this decade, and several studies have shown that communication is becoming one of the bottlenecks for scaling parallel applications. Analyzing the characteristics of communication can help improve the performance of scientific applications. In this paper, we focus on the statistical regularity of time-dimension communication characteristics for representative scientific applications on supercomputer systems, and show that the distribution of communication-event intervals has a power-law decay, a pattern common in phenomena of scientific interest and in human activities. We verify that the distribution of communication-event intervals has a power-law decay on the Tianhe-2 supercomputer, and also on six other parallel systems with three different network topologies and two routing policies. To study the power-law distribution quantitatively, we use two groups of statistics: burstiness vs. memory and periodicity vs. dispersion. Our results indicate that the communication events show a "strong-bursty and weak-memory" characteristic and that the communication-event intervals show both periodicity and dispersion. Finally, our research provides insight into the relationship between communication optimizations and time-dimension communication characteristics.
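The abstract does not define its "bursty" and "memory" statistics. One widely used pair of measures for inter-event intervals is the burstiness parameter B = (σ − μ)/(σ + μ) and the memory coefficient M, the Pearson correlation between consecutive intervals. The sketch below assumes those definitions; whether the paper uses exactly these forms is an assumption, and the function names `burstiness` and `memory` are illustrative.

```python
import math


def burstiness(intervals):
    """B = (sigma - mu) / (sigma + mu) over inter-event intervals.

    B is -1 for perfectly periodic events, near 0 for a Poisson process,
    and approaches 1 for heavy-tailed (bursty) intervals.
    """
    n = len(intervals)
    mu = sum(intervals) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in intervals) / n)
    if sigma + mu == 0.0:
        return 0.0
    return (sigma - mu) / (sigma + mu)


def memory(intervals):
    """Pearson correlation between each interval and the one that follows it.

    M > 0: long intervals tend to follow long ones; M < 0: long intervals
    tend to follow short ones; M near 0: weak memory.
    """
    a, b = intervals[:-1], intervals[1:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # correlation is undefined for constant sequences
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical interval trace: many short gaps and one long pause.
    trace = [1.0] * 99 + [1000.0]
    print(f"B = {burstiness(trace):.2f}, M = {memory(trace):.2f}")
```

Under these definitions, a "strong-bursty and weak-memory" trace would have B well above 0 and M close to 0.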