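Abstract: Because solving the preconditioning equation in the SSOR-preconditioned conjugate gradient algorithm requires forward and backward substitution, the algorithm achieves low parallel efficiency when ported to GPU platforms. To address this, a GPU-accelerated SSOR sparse approximate inverse preconditioner (GSSORSAI) is proposed, based on the Neumann polynomial decomposition technique. It preserves the sparsity and symmetric positive definiteness of the coefficient matrix of the original linear system, and solving the preconditioning equation requires only a single sparse matrix-vector multiplication, which avoids forward and backward substitution. Experimental results show that, on an NVIDIA Tesla C2050 GPU, GSSORSAI is on average nearly 100 times faster than a Python implementation of the SSOR sparse approximate inverse preconditioner on a single CPU. When applied to the parallel PCG algorithm, it improves the convergence speed by a factor of 3 on average compared with the CG algorithm without preconditioning.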
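The idea of replacing the two triangular solves with one matrix-vector product can be illustrated with a minimal CPU sketch. It is not the paper's GPU implementation. It assumes the splitting A = D + L + L^T (L strictly lower triangular), the factored SSOR form M = K K^T with K = (2 - omega)^(-1/2) (D/omega + L) (D/omega)^(-1/2), and a first-order Neumann truncation (I + (D/omega)^(-1) L)^(-1) ≈ I - (D/omega)^(-1) L; the truncation order used in the paper is not stated in the abstract. The function names `ssor_approx_inverse`, `pcg`, and `poisson_2d` are hypothetical, and dense NumPy arrays are used only for readability.

```python
import numpy as np


def ssor_approx_inverse(A, omega=1.0):
    """Approximate inverse of the SSOR preconditioner (first-order Neumann).

    Assumption: A is symmetric positive definite and 0 < omega < 2.
    Returns a symmetric positive definite matrix M_inv ~ M^{-1}.
    """
    n = A.shape[0]
    d = np.diag(A) / omega           # diagonal of D~ = D / omega
    L = np.tril(A, k=-1)             # strictly lower triangular part of A
    # K^{-1} = sqrt(2 - omega) * D~^{1/2} (I + D~^{-1} L)^{-1} D~^{-1};
    # the inner inverse is replaced by its first-order Neumann series.
    neumann = np.eye(n) - L / d[:, None]
    K_bar = np.sqrt(2.0 - omega) * (np.sqrt(d)[:, None] * neumann / d[None, :])
    return K_bar.T @ K_bar


def pcg(A, b, M_inv=None, tol=1e-8, max_iter=1000):
    """Preconditioned CG; the preconditioning step is a single mat-vec."""
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b - A @ x
    z = r if M_inv is None else M_inv @ r
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        # No forward/backward substitution here: z = M_inv @ r.
        z = r if M_inv is None else M_inv @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


def poisson_2d(m):
    """Hypothetical test matrix: 5-point Laplacian on an m-by-m grid."""
    T = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    return np.kron(np.eye(m), T) + np.kron(T, np.eye(m))


if __name__ == "__main__":
    A = poisson_2d(8)
    b = np.ones(A.shape[0])
    _, it_cg = pcg(A, b)
    _, it_pcg = pcg(A, b, ssor_approx_inverse(A))
    print("CG iterations:", it_cg, "PCG iterations:", it_pcg)
```

In a GPU setting the matrix `M_inv` (or its factor) would be stored in a sparse format and applied with a sparse matrix-vector kernel; the dense product above only stands in for that step.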
Abstract: This paper presents a new CG-type algorithm for solving large linear systems. It is obtained from a subclass of the ABS algorithm, Voyevodin's CG method, by choosing the parameter matrix B in some special ways. Having preconditioning properties, the matrix B makes the new algorithm converge fast. The convergence analysis is given. Several ways of choosing B, which are similar to polynomial preconditioning, are discussed. Numerical tests indicate that the new algorithm is effective and competitive. In addition, it is suitable for parallel architectures.
Funding: This work is supported by the Fujian Province Natural Science Foundation (No. 2008J0180) and the Scientific Research Start Foundation of Fujian University of Technology (No. GY-Z0707).