
Performance Testing and Analysis for Matrix Inversion Based on GPU

Cited by: 10
Abstract Inverting a large-scale matrix in serial mode on a CPU is very time-consuming. To address this problem, the paper builds on CUDA (Compute Unified Device Architecture), the platform NVIDIA provides for GPUs (graphics processing units), and approaches the task from a new programming perspective: GPU multi-threaded parallel processing is used to compute in parallel the large volume of data involved in matrix inversion, which yields a substantial speedup. Based on the program's execution results, the paper also analyzes the single-precision and double-precision floating-point performance of the GPU, together with its strengths and weaknesses. Finally, by analyzing how data transfer time affects GPU performance, it summarizes the characteristics of algorithms that are well suited to the GPU.
Source Journal of East China University of Science and Technology (Natural Science Edition), 2010, No. 6, pp. 812-817 (6 pages). Indexed in: CAS, CSCD, Peking University Core Journals.
Funding National "973" Program of China (2009CB918501); National Natural Science Foundation of China (20803022)
Keywords graphics processing unit (GPU); Compute Unified Device Architecture (CUDA); CPU; parallel computation; matrix inversion

