摘要
迭代方法是科学计算中求解大规模稀疏线性代数方程组最常用的方法.大量实际应用表明,迭代方法通常具有较高的通信与计算比,只有在粗粒度并行下才能取得较好的并行可扩展性能.而实际应用大规模计算的需求和当前多核/众核体系结构的发展趋势要求迭代方法具备细粒度并行可扩展能力.文中引入渐近规模,即满足加速条件的计算规模下界,来反映并行迭代方法适应细粒度并行的能力,并由此刻画通信与计算比.基于矩阵的稀疏模式及其通信模式、机器的通信参数和迭代方法的基本运算,给出了渐近规模的理论预测公式.在一台包含128个双路4核计算节点的并行机上,分别基于纯进程并行(MPI)和进程/线程混合并行(MPI/OpenMP),以实际应用中3种常用迭代方法Jacobi、CG、BiCGSTAB为例,分析其渐近规模.并行可扩展性测试表明了渐近规模用于刻画迭代方法通信与计算比的准确性.对于纯进程情形,给出了渐近规模的理论预测与实际测试的对比,表明了理论预测结果的正确性.最后,基于这些结果,从迭代方法的算法设计和并行实现等方面讨论了面向未来更大规模的计算系统,降低通信与计算比的途径.
Iterative method are one of the most efficient numerical algorithms to solve large-scale sparse linear systems arising from scientific computing. The parallel scalability of iterative method can be measured by the communication-to-computation (CtC) during the iterative process. The CtC is high for many iterative methods, as a result, coarse-grain parallelism is needed in order to obtain expected scalability. However, fine-grain parallelism is required for the complicacy of architecture increasing with multi/many-cores. In this paper, we introduce the concept of asymptotic size, which defined as the low bound of the problem size such that satisfying the speedup condition that parallel speedup is more than 1. We hope the asymptotic size can be used to describe the CtC and the ability of fine-grain parallelism for iterative method. Moreover, the theoretical prediction formula of asymptotic size is obtained based on the following parameters, the sparse and communication pattern of matrix, communication parameters of machine, and combination of basic operations of iterative methods. Using asymptotic size, the CtC is analyzed for three popular iterative methods, including Jacobi, CG, BiCGSTAB, on a MPP machine with 128 Double Quad-core nodes. The performance results are given for both MPI-Only and MPI/ OpenMP programming model, which show the usefulness of the asymptotic size for describing the CtC of iterative methods. For MPI-Only case, we also give the comparison of the prediction results and experiment results, which show the validation of the formula of asymptotic size. Finally, future research topics for improving the scalability of the iterative methods on more power supercomputers also discussed based on the analysis conclusions.
出处
《计算机学报》
EI
CSCD
北大核心
2013年第4期782-789,共8页
Chinese Journal of Computers
基金
国家"九七三"重点基础研究发展规划项目基金(2011CB309702)
国家"八六三"高技术研究发展计划项目基金(2012AA01A309)资助~~
关键词
迭代方法
通信与计算比
并行可扩展
渐近规模
多核
众核体系结构
iterative method
communication-to-computation parallel scalability asymptoticsize
multi/many-cores