Abstract
Building on Gauss-Jordan elimination, this paper proposes an improved parallel Gauss-Jordan elimination algorithm adapted to the CUDA architecture for solving large-scale dense linear systems. After analyzing the procedure of the method and the corresponding constraints of CUDA, the paper extends CUDA's three-level grid-block-thread organization, from the standpoint of algorithm construction, into a five-level grid-strip-group-block-thread structure, and introduces the concepts of the "base row" (基础行) and the "global base row" (全局基础行). On this basis it constructs a parallel version of Gauss-Jordan elimination suited to CUDA. The parallel version is compared against serial Gauss-Jordan elimination on test instances of large-scale dense linear systems with dimension up to 4000. The experimental results show that the algorithm makes full use of the GPU's hardware characteristics and effectively reduces the solution time for large-scale dense linear systems.
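For orientation, the following is a minimal serial sketch of Gauss-Jordan elimination, the kind of baseline the abstract compares against. It is not the paper's CUDA algorithm and does not model the grid-strip-group-block-thread structure. The function name `gauss_jordan_solve`, the use of partial pivoting, and the singularity tolerance are assumptions made for illustration.

```python
def gauss_jordan_solve(a, b):
    """Solve a x = b for a dense square system by Gauss-Jordan elimination.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(a)
    # Build the augmented matrix [a | b] as floats so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting (an assumption): pick the row with the largest
        # absolute entry in this column, at or below the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row so that the pivot entry becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row, above and below the pivot.
        # Each row update depends only on that row and the pivot row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # The left block is now the identity, so the last column holds the solution.
    return [row[n] for row in m]
```

Within one elimination step, each row update reads only that row and the pivot row, so the updates are independent of one another. That independence is the general property that makes the method a candidate for GPU parallelization.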
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2011, No. 32, pp. 27-30 (4 pages)
Funding
Harbin Special Research Fund for Scientific and Technological Innovation Talents (No. 2008RFQXG054)