
A hierarchical load balancing parallel computing approach for finite element structural analysis
Abstract  Multi-core clusters have become the primary tools for high-performance computing because of their computing power and cost-effectiveness. However, the different storage mechanisms and non-uniform communication latencies of these machines pose new challenges for the design of efficient parallel algorithms. Traditional domain decomposition methods achieve load balancing by direct partitioning: the structure is divided into a number of equal-sized subdomains determined by the number of processing cores involved in the parallel computation. As the number of processing cores in a single node of a multi-core cluster increases exponentially, the number of subdomains increases dramatically as well. A substantial increase in the number of subdomains rapidly enlarges both the size and the condition number of the interface equations, which degrades the numerical convergence of the system. It also considerably increases the number of processes involved in the parallel computation, which intensifies contention for the limited network ports and bandwidth. The poorer numerical convergence and the higher network communication overhead seriously reduce the efficiency of solving the interface equations and greatly lower the overall parallel efficiency of the domain decomposition method.
To make full use of the computing power of multi-core clusters and improve the parallel efficiency of large-scale finite element structural analysis, this paper proposes a hierarchical load balancing approach. The approach is built on fully exploiting the hierarchy and granularity of the computational tasks. To match the hardware topology of multi-core clusters, the computational tasks of finite element structural analysis are divided into three levels: inter-node parallelism, inter-chip parallelism, and inter-core parallelism. Inter-node and inter-chip parallelism use coarse-grained parallel computing, while inter-core parallelism uses fine-grained parallel computing. By mapping the computational tasks onto the corresponding hardware levels of the cluster, the method not only achieves load balancing at each level but also greatly reduces the communication overhead of the system. It also considerably reduces the number of subdomains, which significantly improves the numerical convergence of the interface equations. To verify the effectiveness of the algorithm, two large-scale parallel numerical experiments on linear static finite element structural analysis were conducted on the "Tianhe-2" supercomputer. For each model, both the traditional domain decomposition method and the proposed hierarchical load balancing approach were run on 50, 100, 150, and 200 nodes. The results show that the proposed method achieves higher speedup and parallel efficiency than the conventional domain decomposition method. The approach is applicable to many kinds of structural analysis problems, including linear static, nonlinear static, and nonlinear dynamic analysis; the present study focuses only on linear static analysis.
Nonlinear and dynamic problems involve multiple iteration steps, so for these and other kinds of structural analysis the proposed method can be encapsulated and called as a sub-procedure; the computation is still dominated by the solution of the same kind of equations.
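The effect of the three-level mapping on the subdomain count can be illustrated with a small sketch. It is not the authors' implementation: the `Cluster` class, the function names, the rule of one coarse-grained subdomain per chip with one thread per core, and the values of 2 chips per node and 12 cores per chip are all assumptions made for illustration. Only the node counts (50, 100, 150, 200) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a multi-core cluster allocation."""
    nodes: int            # compute nodes used in the run (from the abstract)
    chips_per_node: int   # CPU chips per node (assumed value)
    cores_per_chip: int   # cores per chip (assumed value)

    @property
    def total_cores(self) -> int:
        return self.nodes * self.chips_per_node * self.cores_per_chip


def direct_partition_subdomains(cluster: Cluster) -> int:
    """Direct partitioning, assuming one subdomain (and one process) per core."""
    return cluster.total_cores


def hierarchical_layout(cluster: Cluster) -> dict:
    """Assumed hierarchical layout: one coarse-grained subdomain per chip,
    with fine-grained parallelism spread over that chip's cores."""
    subdomains = cluster.nodes * cluster.chips_per_node
    return {
        "subdomains": subdomains,
        "processes": subdomains,
        "threads_per_process": cluster.cores_per_chip,
    }


def place(core_id: int, cluster: Cluster) -> tuple:
    """Map a global core index to (node, chip, core) coordinates."""
    cores_per_node = cluster.chips_per_node * cluster.cores_per_chip
    node, rest = divmod(core_id, cores_per_node)
    chip, core = divmod(rest, cluster.cores_per_chip)
    return node, chip, core


if __name__ == "__main__":
    for n in (50, 100, 150, 200):
        c = Cluster(nodes=n, chips_per_node=2, cores_per_chip=12)
        layout = hierarchical_layout(c)
        print(n, direct_partition_subdomains(c), layout["subdomains"])
```

Under these assumptions the hierarchical layout has fewer subdomains than direct partitioning by a factor equal to the number of cores per chip, which is the kind of reduction the abstract credits for the smaller interface problem and the lower process count.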
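Source  Chinese Science Bulletin (《科学通报》), 2017, No. 13, pp. 1430-1438 (9 pages). Indexed in EI, CAS, CSCD, and PKU Core Journals.
Funding  Supported by the National High Technology Research and Development Program of China (2012AA01A307), the National Natural Science Foundation of China (11272214, 51475287), and the National Key Research and Development Program of China (2016YFB0201800).
Keywords  multi-core cluster, finite element analysis, parallel computing, load balancing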