Runtime Environment Configuration and Optimization on High-performance Computing Clusters (cited by: 5)


Abstract: This paper addresses how to provide a good runtime environment on high-performance computing clusters so that parallel applications achieve higher performance. It identifies the factors that runtime environment configuration and optimization must take into account while a cluster is in operation, grouped into two categories: the allocation and selection of resources across nodes, and the mapping and binding of processes and threads to hardware resources within a node. It then analyzes how these factors affect parallel application performance. Tests with benchmarks and real applications on several platforms confirm the effect: different runtime environment configurations can change application performance by about 20%. The paper closes with a detailed discussion of the specific work that runtime environment optimization still requires.

Author: Cao Zongyan

Affiliation: Supercomputing Center, Computer Network Information Center, Chinese Academy of Sciences

Source: E-science Technology & Application (《科研信息化技术与应用》), 2011, Issue 6, pp. 52-61 (10 pages)

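Funding: Chinese Academy of Sciences 11th Five-Year Plan informatization special project "Supercomputing Environment Construction and Application" (INFO-115-B01); National 863 Program of China (2011AA01A205)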

Keywords: High-performance computing; Cluster; Runtime environment; Performance optimization; Co-design

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
