Abstract: This paper describes parallel simulation of the memory- and computation-intensive acoustic wave equation with CPU template buffer optimization. Taking an 8-core, shared-memory CPU platform as an example, we obtain an initial speed-up of 6.7× over the serial program by using a coarse-grained OpenMP parallel scheme. The data in the template buffer are then vectorized with the single instruction, multiple data (SIMD) technique to further exploit the computing potential of the CPUs. We use 8-channel parallel vectors to simulate seismic wavefields with the 256-bit Advanced Vector Extensions (AVX) instruction set. This increases the computing bandwidth, eliminates a large number of compute instructions, and yields a secondary speed-up of 3–7×. In addition, we use 32-byte data alignment, vectorization along the shortest data direction, and loop tiling to achieve faster program execution. Finally, we analyze the factors affecting the secondary AVX speed-up through three-dimensional modeling experiments with the salt model. The results indicate that optimizing the AVX algorithm lets memory, cache, and registers cooperate more effectively and increases the speed-up.
Funding: This work was funded by the National Natural Science Foundation of China (No. 41274140) and the National Science and Technology Major Projects of China (No. 2017ZX05035003-001).
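To make the two levels of parallelism in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 2D grid, a second-order stencil in space and time, single-precision data, and a GCC/Clang compiler (for the `target("avx")` attribute); the paper itself works in 3D, and its stencil order and memory layout are not given here. All names (`HALO`, `alloc_grid`, `step_scalar`, `row_avx`, `step_avx`) are hypothetical. Rows are distributed across cores with a coarse-grained OpenMP loop, and each row is updated eight floats at a time with 256-bit AVX loads and stores on 32-byte-aligned buffers.

```c
#include <assert.h>
#include <immintrin.h>
#include <string.h>

/* Assumed halo width in x: one full AVX vector (8 floats), so that every
   interior vector starts on a 32-byte boundary. */
#define HALO 8

/* Allocate a zeroed nx*ny float grid aligned to 32 bytes (nx % 8 == 0). */
float *alloc_grid(int nx, int ny)
{
    assert(nx % 8 == 0);
    size_t bytes = (size_t)nx * (size_t)ny * sizeof(float);
    float *g = (float *)_mm_malloc(bytes, 32);
    assert(g != NULL);
    memset(g, 0, bytes);
    return g;
}

/* Scalar reference: next = 2*cur - prev + c * laplacian(cur). */
void step_scalar(const float *prev, const float *cur, float *next,
                 int nx, int ny, float c)
{
    for (int y = 1; y < ny - 1; ++y)
        for (int x = HALO; x < nx - HALO; ++x) {
            int i = y * nx + x;
            float lap = (cur[i - 1] + cur[i + 1])
                      + (cur[i - nx] + cur[i + nx]) - 4.0f * cur[i];
            next[i] = 2.0f * cur[i] - prev[i] + c * lap;
        }
}

/* AVX kernel for one row: 8 grid points per iteration. */
__attribute__((target("avx")))
void row_avx(const float *prev, const float *cur, float *next,
             int nx, int y, float c)
{
    const __m256 vc = _mm256_set1_ps(c);
    const __m256 v2 = _mm256_set1_ps(2.0f);
    const __m256 v4 = _mm256_set1_ps(4.0f);
    for (int x = HALO; x < nx - HALO; x += 8) {
        int i = y * nx + x;
        /* Centre, upper, and lower vectors are 32-byte aligned. */
        __m256 ctr  = _mm256_load_ps(cur + i);
        __m256 up   = _mm256_load_ps(cur + i - nx);
        __m256 down = _mm256_load_ps(cur + i + nx);
        /* x-neighbours are shifted by one float, hence unaligned loads. */
        __m256 left  = _mm256_loadu_ps(cur + i - 1);
        __m256 right = _mm256_loadu_ps(cur + i + 1);
        __m256 lap = _mm256_sub_ps(
            _mm256_add_ps(_mm256_add_ps(left, right),
                          _mm256_add_ps(up, down)),
            _mm256_mul_ps(v4, ctr));
        __m256 res = _mm256_add_ps(
            _mm256_sub_ps(_mm256_mul_ps(v2, ctr), _mm256_load_ps(prev + i)),
            _mm256_mul_ps(vc, lap));
        _mm256_store_ps(next + i, res);
    }
}

/* Coarse-grained OpenMP over rows; the pragma is ignored without -fopenmp. */
void step_avx(const float *prev, const float *cur, float *next,
              int nx, int ny, float c)
{
    assert(nx % 8 == 0);
    #pragma omp parallel for schedule(static)
    for (int y = 1; y < ny - 1; ++y)
        row_avx(prev, cur, next, nx, y, c);
}
```

A caller would allocate three grids with `alloc_grid`, fill `prev` and `cur`, and call `step_avx(prev, cur, next, nx, ny, c)` once per time step, rotating the three pointers afterwards. The one-vector halo is a design choice made for this sketch so that aligned loads can be used; loop tiling, which the abstract also mentions, is not shown.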