Abstract
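Iterative methods are a basic approach to solving linear systems; when the coefficient matrix is large and sparse, using them efficiently becomes particularly important. By analyzing the general characteristics of iterative methods, this paper proposes general methods for accelerating them on GPUs, which offer strong computing power and high memory bandwidth. Using these methods, a classic iterative method, PQMRCGSTAB, is implemented on two mainstream GPU platforms, and specific optimizations are proposed for the characteristics of each platform. Compared with a 2.4 GHz quad-core AMD Opteron processor, the double-precision version of PQMRCGSTAB achieves a 31-fold performance improvement when accelerated on an NVIDIA Tesla S1070 and a 9-fold improvement when accelerated on an AMD Radeon HD 4870 X2.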
Iterative methods are fundamental solvers for linear systems, and they are especially important for systems with large, sparse coefficient matrices. In this paper, after analyzing the common characteristics of iterative methods, we present several general approaches for accelerating them on GPUs, which offer high computing power and memory bandwidth. Using these approaches, we port a classic iterative method, PQMRCGSTAB, to two popular GPU platforms and introduce optimizations based on their different architectures. Compared with a parallel program running on the four cores of a quad-core AMD Opteron(tm) 2380 processor, the PQMRCGSTAB algorithm achieves a 31x speedup on an NVIDIA Tesla S1070 platform and a 9x speedup on an AMD Radeon HD 4870 X2 platform.
Source
《计算机工程与科学》
CSCD
Peking University Core Journal (北大核心)
2009, Issue A01, pp. 179-182 (4 pages)
Computer Engineering & Science
Funding
National 863 Program of China (2008AA01Z110)
National Natural Science Foundation of China (60673150, 60903044)