期刊文献+

基于GPU的SSOR稀疏近似逆预条件研究 被引量:2

Research of the SSOR sparse approximate inverse preconditioner on GPUs
下载PDF
导出
摘要 由于SSOR预条件共轭梯度算法中预条件方程求解需要前推和回代,导致算法迁移到GPU平台上并行效率不高.为此,基于诺依曼多项式分解技术,提出了一种GPU加速的SSOR稀疏近似逆预条件子(GSSORSAI).它不仅保持了原线性系统系数矩阵的稀疏和对称正定特性,而且预条件方程求解仅需一次稀疏矩阵矢量乘运算,避免了前推和回代过程.实验结果表明:在NVIDIA Tesla C2050GPU上,对比使用Python在单个CPU上SSOR稀疏近似逆预条件子实现方法,GSSORSAI平均快将近100倍;应用到并行的PCG算法中,相比无预条件的CG算法,平均提高了算法的3倍的收敛速度. For the SSOR preconditioned conjugate gradient algorithm,the preconditioner equation solving needs the forward/backward substitutions,which greatly prevents parallelizing SSOR PCG algorithms on the GPU platform due to their strong serial processing.Thus,based on the Neumann series approximation, a GPU accelerated SSOR sparse approximate inverse preconditioner is proposed.For GSSORSAI,it preserves the sparse and symmetric positive characteristics of the original coefficient matrix in the linear system,and the preconditioner equation solving only needs a sparse matrix-vector multiplication operation,which avoids the forward/backward substitutions.Experiments results show on the NVIDIA Tesla C2050 GPU,GSSORSAI is generated on average 100 times faster than the implementation by Python on single CPU.Compared to the convergence of the CG algorithm,the PCG algorithm with GSSORSAI has on average 3times faster convergent rate.
出处 《浙江工业大学学报》 CAS 北大核心 2016年第2期140-145,共6页 Journal of Zhejiang University of Technology
基金 国家自然科学基金资助项目(61379017)
关键词 SSOR预条件子 预条件共轭梯度算法 稀疏近似逆 GPU SSOR preconditioner preconditioned conjugate gradient algorithm sparse approximate inverse GPU
  • 相关文献

参考文献12

  • 1GOLUB G H, VAN LOAN C F. Matrix eomputations[M]. Baltimore: The John Hopkins University Press, 1989.
  • 2王厂文,张有正.正定矩阵和的行列式不等式[J].浙江工业大学学报,2006,34(3):351-354. 被引量:1
  • 3KAASSCH1ETER E F. Preconditioned conjugate gradients for solving singular systems[J]. CAM journal, 1988 (24) : 265- 275.
  • 4龙爱芳.避免二阶导数计算的迭代法[J].浙江工业大学学报,2005,33(5):602-604. 被引量:2
  • 5AMENT M, KNITTEL G, WEISKOPF D. A parallel precon- ditioned conjugate gradient solver for the Poisson problem on a multi-GPU[J]. Parallel distributed and network-based compu- ting,2010(6) :583-592.
  • 6SAAD Y. Iterative Methods for Sparse Linear Systems[M]. Philadelphia: SIAM,2003.
  • 7CHOW E, SAAD Y. Approximate inverse preconditioners via sparse-sparse iterations [ J ]. SIAM journal, 1998 ( 19 ) : 995- 1023.
  • 8GORTE M J, HUCKLE T. Parallel preconditioning with sparse approximate inverses [J]. SIAM journal, 1997 (18) : 838-853.
  • 9BELL N, GARLAND M. Efficient sparse matrix-vector multi- plication on CUDA[R]. Santa Clara: NVIDIA,2008.
  • 10马超,韦刚,裴颂文,吴百锋.GPU上稀疏矩阵与矢量乘积运算的一种改进[J].计算机系统应用,2010,19(5):116-120. 被引量:2

二级参考文献25

  • 1盛兴平.广义实正定矩阵的几个不等式[J].大学数学,2004,20(4):105-107. 被引量:1
  • 2丁卫平.关于正定矩阵一不等式的简单证明[J].大学数学,2004,20(6):109-110. 被引量:2
  • 3熊斌.Schur不等式和Hlder不等式及其应用[J].数学通讯(教师阅读),2005,19(8):41-44. 被引量:8
  • 4李庆扬.数值分析[M].武汉:华中工学院出版社,1986.216-217.
  • 5Open Computing Language (OpenCL).[2009-08-05]http:www.khronos,org/opencl/.
  • 6Bell N,Garland M.Efficient sparse matrix-vector multiplication on CUDA.NVIDIA Technical Report NVR-2008-004,December 2008.
  • 7Muthu Manikandan Baskaran,Rajesh Bordawekar.Optimizing Sparse Matrix-Vector Multiplication on GPUs,IBM Technical Report RC24704.2008.
  • 8Harris M.High Performance Computing with CUDA-Optimizing CUDA,Super-computing Tutorials (2007)[2009-08-05].http://gpgpu.org/sc2007.
  • 9Sengupta S,Harris M,Zhang Y,Owens JD.Scan primitives for gpu computing.GH'07:Proc.of the 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics hardware,2007.97-106.
  • 10Davis T.The University of Florida Sparse Matrix Collection.[2009-08-05]http://www.cise.ufl.edu/research/ sparse/matrices/.

共引文献7

同被引文献10

引证文献2

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部