Abstract
Large sparse linear systems arise frequently in scientific computing and engineering, and many iterative methods and preconditioning techniques have been developed to solve them. Diagonal-based incomplete LU (DILU) is a preconditioning technique similar to incomplete LU (ILU) factorization. It is an important preconditioner in OpenFOAM, an open-source computational fluid dynamics package, but it has received little attention outside OpenFOAM, and no complete GPU implementation exists so far. This paper compares DILU and ILU as preconditioners for the biconjugate gradient stabilized method (BiCGStab), both in how much they accelerate the solver and in the cost of constructing the preconditioner. The results show that DILU accelerates convergence no less effectively than ILU and is more stable. For GPU parallelization, DILU can use two strategies: a level-set scheme and a synchronization-free scheme, that is, one without global synchronization. The paper discusses in detail how DILU is implemented under each scheme, gives the corresponding algorithms and reference code, and compares the performance of DILU under the two schemes. Numerical experiments show that each scheme has its own strengths and weaknesses in practice, so the choice can be made according to observed performance. The paper also compares DILU preconditioning on GPU and on CPU: the GPU shows a clear performance advantage, and programs whose bottleneck is the solution of linear systems can be ported to GPU platforms to improve performance.
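For orientation, the DILU idea summarized above can be sketched serially in a few lines. The sketch below assumes the common textbook formulation, in which only the pivots d are computed and the preconditioner is M = (D + L) D^{-1} (D + U), with L and U the strict triangles of A. It is not the paper's GPU code. The function names `dilu_pivots`, `dilu_apply`, and `level_sets`, and the list-of-dicts row storage, are hypothetical choices made for illustration.

```python
def dilu_pivots(rows):
    """rows[i] is a dict {column: value} for row i of A (assumed storage).
    Returns the DILU pivots d; the off-diagonal entries of A are reused as is."""
    n = len(rows)
    d = [rows[i][i] for i in range(n)]
    for i in range(n):
        for j, a_ij in rows[i].items():
            # Update a later pivot only where both a_ij and a_ji are stored.
            if j > i and i in rows[j]:
                d[j] -= rows[j][i] * a_ij / d[i]
    return d


def dilu_apply(rows, d, r):
    """Solve M z = r with M = (D + L) D^{-1} (D + U)."""
    n = len(rows)
    y = [0.0] * n
    for i in range(n):  # forward sweep: (D + L) y = r
        s = sum(a * y[j] for j, a in rows[i].items() if j < i)
        y[i] = (r[i] - s) / d[i]
    z = [0.0] * n
    for i in reversed(range(n)):  # backward sweep: (I + D^{-1} U) z = y
        s = sum(a * z[j] for j, a in rows[i].items() if j > i)
        z[i] = y[i] - s / d[i]
    return z


def level_sets(rows):
    """Group rows into levels for the forward sweep: a row's level is one more
    than the highest level among the earlier rows it depends on."""
    level = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [level[j] for j in row if j < i]
        level[i] = 1 + max(deps) if deps else 0
    groups = {}
    for i, lv in enumerate(level):
        groups.setdefault(lv, []).append(i)
    return [groups[lv] for lv in sorted(groups)]


# Small tridiagonal example; for such a matrix M coincides with A.
A = [{0: 4.0, 1: -1.0}, {0: -1.0, 1: 4.0, 2: -1.0}, {1: -1.0, 2: 4.0}]
d = dilu_pivots(A)
z = dilu_apply(A, d, [2.0, 4.0, 10.0])  # right-hand side is A @ [1, 2, 3]
```

Rows that share a level have no dependencies on each other, so a level-set scheme can process them in parallel and synchronize between levels. A synchronization-free scheme would instead let each row wait only on the specific rows it depends on.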
Authors
汪晋
刘江
WANG Jin; LIU Jiang (Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2022, Issue 6, pp. 108-118 (11 pages)
Computer Science
Funding
National Natural Science Foundation of China (61672488)
National Key Research and Development Program of China (2018YFC0116704).