Analysis and application of structural mechanical simulation accelerated by GPU with heterogeneous architecture

Abstract: GPU-accelerated computing has gradually become a common feature of mainstream applications in high performance computing (HPC). ABAQUS, a leading structural mechanics package, supports general-purpose GPU (GPGPU) computing, exploiting the GPU's high floating-point throughput and memory bandwidth to improve solver efficiency. This paper first reviews the history of GPU acceleration support in ABAQUS. It then tests GPU acceleration on practical engineering cases in three different heterogeneous environments at the Shanghai Supercomputer Center, establishes a method for analyzing computational efficiency, and examines the effect of GPU acceleration on solution time, system resource usage, and parallel efficiency. Finally, it offers recommendations on the sensible use of resources when solving problems with millions of degrees of freedom using the implicit method.
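The abstract does not spell out the efficiency analysis method itself. As a minimal sketch, assuming the textbook definitions of speedup (baseline wall time divided by accelerated wall time) and parallel efficiency (speedup divided by the number of workers), the two quantities could be computed as follows. The function names and the timing values are hypothetical placeholders for illustration and are not results from the paper.

```python
def speedup(t_baseline: float, t_accelerated: float) -> float:
    """Ratio of baseline wall time to accelerated wall time (>1 means faster)."""
    if t_baseline <= 0 or t_accelerated <= 0:
        raise ValueError("wall times must be positive")
    return t_baseline / t_accelerated


def parallel_efficiency(t_single: float, t_parallel: float, n_workers: int) -> float:
    """Speedup over a single worker, divided by the worker count (1.0 is ideal)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return speedup(t_single, t_parallel) / n_workers


if __name__ == "__main__":
    # Placeholder timings in seconds, chosen only to illustrate the arithmetic.
    cpu_only_time = 1200.0      # hypothetical CPU-only solver run
    cpu_plus_gpu_time = 800.0   # hypothetical run of the same job with a GPU
    print("GPU speedup:", speedup(cpu_only_time, cpu_plus_gpu_time))

    one_core_time = 1200.0      # hypothetical single-core run
    eight_core_time = 200.0     # hypothetical eight-core run
    print("Parallel efficiency:", parallel_efficiency(one_core_time, eight_core_time, 8))
```

Comparing the speedup obtained by adding a GPU with the efficiency obtained by adding CPU cores is one way to reason about which resource is the better use of an allocation for a given job size.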

Authors: GUO Peiqing (郭培卿), CHEN Xiaolong (陈小龙)

Affiliation: Department of High Performance Computing Application Technology, Shanghai Supercomputer Center

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2014, Issue A01, pp. 78-81 (4 pages)

Funding: National High Technology Research and Development Program of China (863 Program), grant 2012AA01A308

Keywords: computer-aided engineering (CAE); structural mechanics; ABAQUS; GPU acceleration; high performance computing (HPC)

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

