Accelerating Iterative Methods in Solving Linear Systems on GPUs

Cited by: 4
Abstract: Iterative methods are a fundamental way to solve linear systems, and using them efficiently matters most when the coefficient matrix is large and sparse. By analyzing the characteristics that iterative methods have in common, this paper proposes general approaches for accelerating them on GPUs, which offer high compute throughput and memory bandwidth. Using these approaches, a classic iterative method, PQMRCGSTAB, is implemented on two mainstream GPU platforms, with specific optimizations tailored to each platform's architecture. Compared with a parallel run on the four cores of a 2.4 GHz quad-core AMD Opteron 2380 processor, the double-precision PQMRCGSTAB achieves a 31x speedup on an NVIDIA Tesla S1070 and a 9x speedup on an AMD Radeon HD 4870 X2.
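Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2009, Issue A01, pp. 179-182 (4 pages)
Funding: National 863 Program of China (2008AA01Z110); National Natural Science Foundation of China (60673150, 60903044)
Keywords: GPU; iterative method; acceleration; PQMRCGSTAB algorithm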
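
The abstract stresses large sparse coefficient matrices, and in iterative solvers of this family the sparse matrix-vector product is typically the dominant cost per iteration. The sketch below is only an illustration of that operation on a matrix held in compressed row storage (CSR); it is not the authors' implementation, and the function name `csr_matvec` and the small example matrix are assumptions made for this example. It is written in plain Python to show the data layout and the row-level independence that a GPU kernel could exploit, for instance by assigning one row to each thread.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    # Every row is independent of the others. On a GPU, this outer loop is
    # the part that would usually be distributed across threads.
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (an assumption for this sketch):
#   A = [[4, 0, 1],
#        [0, 3, 0],
#        [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]

y = csr_matvec(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
print(y)
```

Because the inner loop reads `x` through `col_idx`, memory access is irregular, which is one reason memory bandwidth and platform-specific tuning matter as much as raw arithmetic throughput for this kind of kernel.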