Abstract
For large-scale sparse linear algebraic systems, algebraic multigrid (AMG) is a solver with optimal computational complexity. However, its algorithmic workflow is complex, which makes it difficult both to achieve ideal parallel scalability and to locate and analyze parallel scalability bottlenecks. By analyzing the performance skeleton and communication patterns of the AMG algorithm, this paper identifies three categories of scalability bottlenecks and introduces the concept of the sparse matrix communication domain to characterize how the sparsity pattern affects parallel communication performance. For six typical test cases with different sparsity-pattern characteristics, drawn from three practical applications (radiation hydrodynamics, structural mechanics, and aero-engines), parallel scalability bottlenecks are located and analyzed at multiple granularities, and future directions for optimizing AMG parallel performance are summarized.
Authors
MAO Runzhang (毛润彰)
DU Hao (杜皓)
TIAN Hongyun (田鸿运)
HUANG Silu (黄思路)
ZHANG Peng (张鹏)
XU Xiaowen (徐小文)
MAO Runzhang; DU Hao; TIAN Hongyun; HUANG Silu; ZHANG Peng; XU Xiaowen (Graduate School of China Academy of Engineering Physics, Beijing 100088, China; Institute of Applied Physics and Computational Mathematics, Beijing 100094, China; Kuang Yaming Honors School of Nanjing University, Nanjing, Jiangsu 210023, China; Software Center for High Performance Numerical Simulation, China Academy of Engineering Physics, Beijing 100088, China)
Source
Chinese Journal of Computational Physics (《计算物理》)
CSCD
Peking University Core Journal
2024, No. 4, pp. 403-417 (15 pages)
Funding
Supported by the National Natural Science Foundation of China (62032023).
Keywords
algebraic multigrid
parallel preconditioning algorithms
parallel scalability
performance analysis
performance bottleneck