Funding: Project supported by the Science Fund for Creative Research Groups of the National Natural Science Foundation of China (Grant No. 60921062), the National Natural Science Foundation of China (Grant No. 60873014), and the Young Scientists Fund of the National Natural Science Foundation of China (Grant Nos. 61003082 and 60903059).
Abstract: The availability of computers and communication networks allows us to gather and analyse data on a far larger scale than was previously possible. At present, statistics is believed to be a suitable method for analysing networks with millions of vertices or more. The MATLAB language, with its large library of statistical functions, is a good choice for rapidly prototyping complex-network algorithms. The performance of MATLAB code can be further improved by using graphics processing units (GPUs). This paper presents the strategies and performance of a GPU implementation of a complex networks package, built with the Jacket toolbox for MATLAB. Compared with some commercially available CPU implementations, the GPU achieves an average speedup of 11.3x. The experimental results indicate that the GPU platform and the MATLAB language are a good combination for complex network research.
Funding: Supported by the National Natural Science Foundation of China (No. 10805004) and the Foundation of National Key Lab. of Science and Technology on Vacuum & Cryogenic of China (No. 9140C550404100C55).
Abstract: Based on the three-dimensional particle-in-cell (PIC) method and the Compute Unified Device Architecture (CUDA), a parallel particle simulation code running on a graphics processing unit (GPU) has been developed to simulate charge-exchange (CEX) xenon ions in the plume of an ion thruster. Using the proposed technique, the potential and CEX plasma distributions are calculated for the ion thruster plume surrounding the DS1 spacecraft at different thrust levels. The simulation results are in good agreement with measured CEX ion parameters reported in the literature, and the GPU's results are equal to the CPU's. Compared with a single Intel Core 2 E6300 CPU, the 16-processor NVIDIA GeForce 9400 GT GPU achieves a speedup factor of 3.6 when the total number of macro particles is 1.1 × 10^6. The simulation results also reveal how the backflow CEX plasma affects the spacecraft floating potential, indicating that the ion thruster plume is indeed able to alleviate the extreme negative floating potentials of spacecraft in geosynchronous orbit.
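Abstract: Molecular dynamics (MD) simulation is the main method for studying the thermodynamic properties of silicon nanofilms, but its large data volumes, computational intensity, and complex interatomic interaction models have limited its wider application. To address the wasted parallel resources and thread stalls caused by non-contiguous data access and heavy branching in the crystalline-silicon MD simulation algorithm, the algorithm is redesigned around the hardware architecture of the Nvidia Tesla V100 GPU. Optimizations including coalesced global memory access, loop unrolling, and atomic operations exploit the GPU's parallel computing and floating-point capabilities to reduce device memory accesses, branch divergence, and conditional instructions during execution, improving the algorithm's overall performance. Test results show that the optimized crystalline-silicon MD simulation algorithm is 1.69 to 1.97 times faster than the unoptimized version, and 3.20 to 3.47 times and 17.40 to 38.04 times faster than the widely used GPU-accelerated MD packages HOOMD-blue and LAMMPS, respectively, which is a good acceleration result.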