
Performance Analysis and Tuning of Large-scale Finite Element Analysis Program on Multi-core Cluster Platform

Cited by: 2
Abstract: The performance of an open-source finite element analysis program implemented with the MPI programming model was analyzed on a multi-core cluster platform, and the program's bottlenecks and their causes were identified. Based on this bottleneck analysis, two optimization schemes were proposed: solving large-scale linear and nonlinear systems of equations with an MPI/OpenMP hybrid parallel programming model, and performing message communication with multiple threads and multiple processes simultaneously. Experiments at different problem sizes show that, on a multi-core cluster platform, the large-scale nonlinear equation solver implemented with the MPI/OpenMP hybrid model achieves a 2x to 3x performance improvement over the pure MPI parallel program. The multi-thread, multi-process simultaneous message-passing scheme does improve performance, but it is not the best way to resolve the program's communication bottleneck. The overall performance analysis of the two schemes indicates that, on a multi-core cluster platform, parallel programs based on the MPI/OpenMP hybrid programming model make better use of the computing capacity of the hardware.
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 1, pp. 305-310 (6 pages)
Keywords: MPI/OpenMP; OpenSeesSP; multi-core; nonlinear equation solving


