Parallel Cholesky Decomposition and Its Application Based on GPU
Cited by: 1
Abstract: In clMAGMA, a library built on the OpenCL parallel computing framework, the Cholesky decomposition algorithm uses a large-block parallel scheme. This scheme cannot make full use of the GPU's high-speed local memory, and it requires multiple GPU-CPU data transfers during the computation. To address this, a small-block parallel method is proposed that makes full use of the GPU's high-speed local memory and reuses the inverse of each matrix sub-block to perform an efficient Cholesky decomposition of symmetric positive definite matrices. The method can also be applied to the decomposition of large positive definite matrices arising in three-dimensional vision bundle adjustment. Experimental results show that the proposed Cholesky decomposition is more than 50% faster than clMAGMA and, for bundle adjustment problems, about 38 times faster than the Eigen library used in Ceres Solver.
Authors: SHEN Yan; DAI Yuxing (College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; College of Mathematics, Physics and Electronic Information Engineering, Wenzhou University, Wenzhou, Zhejiang 325035, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2019, Issue 2, pp. 284-289 (6 pages)
Funding: Key Project of the Natural Science Foundation of Zhejiang Province (LZ16E050002)
Keywords: positive definite system; Cholesky decomposition; parallel computing; OpenCL framework; bundle adjustment
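
The small-block idea described in the abstract can be illustrated with a minimal CPU-side sketch. This is not the authors' OpenCL implementation: the function names (`cholesky_unblocked`, `invert_lower`, `blocked_cholesky`) and the block size of 2 are assumptions made for illustration only. The sketch factors each small diagonal block, inverts its triangular factor once, and reuses that inverse for every sub-block below it, which is the kind of reuse that a GPU kernel could keep in local memory.

```python
import math


def cholesky_unblocked(a):
    """Return lower-triangular L with a = L L^T for a small SPD block."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def invert_lower(l):
    """Invert a lower-triangular matrix column by column (forward substitution)."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for c in range(n):
        for r in range(c, n):
            rhs = 1.0 if r == c else 0.0
            s = sum(l[r][k] * inv[k][c] for k in range(c, r))
            inv[r][c] = (rhs - s) / l[r][r]
    return inv


def blocked_cholesky(a, bs=2):
    """Right-looking blocked Cholesky; bs is an illustrative block size."""
    n = len(a)
    w = [row[:] for row in a]  # working copy; its lower triangle becomes L
    for k0 in range(0, n, bs):
        k1 = min(k0 + bs, n)
        # Step 1: factor the small diagonal block and invert its factor once.
        diag = [row[k0:k1] for row in w[k0:k1]]
        lkk = cholesky_unblocked(diag)
        inv = invert_lower(lkk)
        for r in range(k0, k1):
            for c in range(k0, k1):
                w[r][c] = lkk[r - k0][c - k0]
        # Step 2: panel update L_ik = A_ik * inv(L_kk)^T.
        # The same inverse is reused for every row below the diagonal block.
        for r in range(k1, n):
            old = w[r][k0:k1]
            for c in range(k0, k1):
                w[r][c] = sum(old[m] * inv[c - k0][m] for m in range(k1 - k0))
        # Step 3: trailing update A_ij -= L_ik * L_jk^T (lower triangle only).
        for r in range(k1, n):
            for c in range(k1, r + 1):
                w[r][c] -= sum(w[r][m] * w[c][m] for m in range(k0, k1))
    # Clear the strict upper triangle so the result is exactly L.
    for r in range(n):
        for c in range(r + 1, n):
            w[r][c] = 0.0
    return w
```

Calling `blocked_cholesky(A, bs=2)` on a symmetric positive definite matrix given as a list of lists returns the lower-triangular factor. On a GPU, steps 2 and 3 are the parts that would run in parallel across sub-blocks; here they are plain loops so the data flow stays visible.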