Abstract
Using the SIMD instructions of ARMv8, this paper implements, in assembly for the domestic Phytium processor, an accelerated solver for the one-dimensional convection equation based on multi-time-step advancement over partial grids. Compared with an explicit time-stepping algorithm implemented in C, the approach has two main advantages: 1) for the same amount of grid computation, it greatly reduces the access latency overhead along the time-step dimension; 2) for the same number of time steps, it makes full use of SIMD instructions to reduce the computation latency overhead. Numerical experiments and performance evaluations were carried out on a domestic Phytium CPU. The results show that in single-threaded computation, with 20 floating-point registers used for floating-point operations, the optimized algorithm reaches a peak speed 4.35 times that of the compiler-optimized general numerical solver, significantly improving the efficiency of serial computation.
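The idea of advancing part of the grid through several time steps before moving on can be illustrated with a minimal C sketch. This is not the paper's assembly implementation: it assumes a first-order upwind scheme with a fixed inflow value at `u[0]` and a Courant number `c`, and the names `advect_reference`, `advect_blocked`, `max_abs_diff`, `TB`, and `block` are hypothetical. SIMD vectorization is not shown; only the reordering of the time and space loops is.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define TB 4 /* time steps fused per pass (assumed value) */

/* Baseline: one full sweep over the grid per time step.
 * Upwind update u_i <- u_i - c * (u_i - u_{i-1}); u[0] is a fixed inflow value.
 * Sweeping from right to left lets the update run in place. */
static void advect_reference(double *u, size_t n, double c, int steps) {
    assert(n >= 2);
    for (int s = 0; s < steps; ++s)
        for (size_t i = n - 1; i > 0; --i)
            u[i] -= c * (u[i] - u[i - 1]);
}

/* Time-blocked variant: each block of grid points is advanced up to TB
 * time steps before the next block is touched. edge[k] holds the value of
 * the point just left of the current block at time level k. */
static void advect_blocked(double *u, size_t n, double c, int steps, size_t block) {
    assert(n >= 2 && block >= 1);
    for (int s = 0; s < steps; s += TB) {
        int tb = (steps - s < TB) ? steps - s : TB;
        double edge[TB];
        for (int k = 0; k < tb; ++k)
            edge[k] = u[0]; /* inflow value is the same at every level */
        for (size_t lo = 1; lo < n; lo += block) {
            size_t hi = (lo + block < n) ? lo + block : n;
            for (int k = 0; k < tb; ++k) {
                double last = u[hi - 1]; /* level-k value the next block needs */
                for (size_t i = hi - 1; i > lo; --i)
                    u[i] -= c * (u[i] - u[i - 1]);
                u[lo] -= c * (u[lo] - edge[k]);
                edge[k] = last;
            }
        }
    }
}

/* Largest absolute difference between two arrays of length n. */
static double max_abs_diff(const double *a, const double *b, size_t n) {
    double m = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = fabs(a[i] - b[i]);
        if (d > m) m = d;
    }
    return m;
}
```

Both functions perform the same arithmetic per grid point, so their results should agree up to rounding. The blocked version revisits each block `TB` times while it is still close to the core, which is one way to cut the number of passes over the whole array.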
Authors
LIAO Yi-xiao (廖逸枭)
SHAO Li-song (邵立松)
WANG Guang-xue (王光学)
ZHENG Min (郑敏)
Affiliations: Sun Yat-sen University, Shenzhen 518000, China; Phytium Information Technology Co., Ltd., Tianjin 300000, China
Source
Aeronautical Computing Technique (《航空计算技术》)
2023, No. 3, pp. 35-39 (5 pages)
Funding
Supported by a National Major Project (GJXM92579).
Keywords
one-dimensional convection equation
SIMD
Phytium processor
single thread
accelerated computing