The Research of the Implicit Algorithm and Program Based on the GPU
Abstract  A graphics processing unit (GPU) can raise the computing speed of a desktop computer by one to two orders of magnitude, so developing implicit algorithms suited to it is important. Based on the hardware characteristics of the GPU, this study selects the DP-LUR implicit method and improves it further. Because GPU algorithms call for low memory use, the matrix operations associated with the right-hand-side term of the DP-LUR method are first rewritten in a scalar form that can be generalized. This modification is exactly equivalent to the original method, yet its form is very concise and it greatly reduces memory storage and read/write requirements. The left-hand-side matrix is then diagonalized, which further reduces memory storage and read/write requirements and lowers the cost of each iteration step. It also slows convergence, however, so the total computational cost is about 20% higher than with the first modification. The two modifications are independent of each other and can be adopted separately or together as needed.
Source  Journal of Engineering Thermophysics (《工程热物理学报》), indexed in EI, CAS, CSCD, and the Peking University Core list; 2013, No. 11, pp. 2043-2047 (5 pages)
Funding  National Natural Science Foundation of China (No. 51276092)
Keywords  GPU; implicit algorithm; DP-LUR method; low-memory algorithm
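
The abstract does not give the paper's equations, so the following is only a minimal sketch of the kind of relaxation that DP-LUR is usually described as using: pointwise, Jacobi-style sweeps in which each cell reads only values from the previous sweep. It is not the authors' scalar right-hand-side form or their diagonalized left-hand side. The model problem (1D linear advection with first-order upwinding and implicit Euler time marching) and all names (`dplur_step`, `u_in`, `sweeps`) are assumptions chosen for illustration.

```python
def dplur_step(u, a, dx, dt, u_in, sweeps=4):
    """One implicit Euler step for u_t + a*u_x = 0 (a > 0, first-order upwind).

    The implicit system is relaxed with Jacobi-style sweeps (illustrative only).
    """
    n = len(u)
    # Explicit residual R_i = a * (u_i - u_{i-1}) / dx; the left boundary uses u_in.
    res = [a * (u[i] - (u[i - 1] if i > 0 else u_in)) / dx for i in range(n)]
    diag = 1.0 / dt + a / dx  # diagonal of the implicit operator (a scalar here)
    off = a / dx              # coupling to the upwind neighbour
    # Sweep 0: drop the off-diagonal coupling entirely.
    du = [-r / diag for r in res]
    # Later sweeps: each cell reads only the previous sweep's values, so all
    # cells could be updated at once, for example one GPU thread per cell.
    for _ in range(sweeps):
        prev = du
        du = [(-res[i] + (off * prev[i - 1] if i > 0 else 0.0)) / diag
              for i in range(n)]
    return [u[i] + du[i] for i in range(n)]
```

Calling `dplur_step` repeatedly from `u = [0.0] * 10` with `u_in = 1.0` drives every cell toward the inflow value. Each sweep needs only the residual and the previous correction of one neighbour, which hints at why low-storage formulations matter for this family of methods on a GPU.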