
Optimizing All_to_All Communication in Infiniband
Abstract All_to_All is an important collective operation. Current commercial Infiniband networks lack an effective congestion control mechanism. This paper experimentally studies the performance of two typical All_to_All algorithms, the make-pair algorithm and the post-all algorithm, in Infiniband networks. When transferring large messages (over 32 KB), both algorithms cause severe congestion in the network, so that network bandwidth utilization reaches only 30% to 70%. To reduce congestion, the paper splits each large message into small messages and schedules their transmission. A reliable connection is established between every pair of processes, and each connection maintains a counter of outstanding send requests. When the counter exceeds a predefined threshold, the communication link between the two processes is considered congested, and no new send requests are posted to that connection's send queue, which keeps the congestion from spreading across the whole network. Experimental results show that the optimized algorithm alleviates network congestion and improves All_to_All performance; compared with existing algorithms, bandwidth utilization improves by more than 10%, and by up to 20%.
Source Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list, 2014, No. 8, pp. 1863-1870 (8 pages)
Funding National Key Technology R&D Program of China (2011BAH04B03)
Keywords All_to_All algorithm; congestion control; message split; message schedule; Infiniband
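
The scheduling scheme summarized in the abstract (split large messages into small ones, count outstanding send requests per connection, and pause posting once the counter exceeds a threshold) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the paper's implementation: it models a single sender, uses no Infiniband verbs, and the names `split_message`, `Connection`, `run_all_to_all`, and `completion_period` are hypothetical. Congestion is modeled simply as a peer whose sends complete only once every few steps.

```python
from collections import deque


def split_message(size, chunk_size):
    """Cut `size` bytes into chunk sizes of at most `chunk_size` bytes."""
    full, rest = divmod(size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


class Connection:
    """One reliable connection to a peer (hypothetical model)."""

    def __init__(self, peer, message_size, chunk_size):
        self.peer = peer
        self.pending = deque(split_message(message_size, chunk_size))
        self.outstanding = 0      # send requests posted but not yet completed
        self.max_outstanding = 0  # high-water mark, kept for inspection


def run_all_to_all(connections, threshold, completion_period):
    """Simulate one sender and return the number of steps taken.

    `completion_period[peer]` is an assumed toy congestion model:
    one outstanding send to `peer` completes every that many steps.
    """
    if any(completion_period[c.peer] < 1 for c in connections):
        raise ValueError("completion periods must be >= 1")
    steps = 0
    while any(c.pending or c.outstanding for c in connections):
        steps += 1
        for c in connections:
            # Stop posting to a connection whose counter exceeds the threshold.
            if c.pending and c.outstanding <= threshold:
                c.pending.popleft()
                c.outstanding += 1
                c.max_outstanding = max(c.max_outstanding, c.outstanding)
        for c in connections:
            # Reap at most one completion per connection when its period elapses.
            if c.outstanding and steps % completion_period[c.peer] == 0:
                c.outstanding -= 1
    return steps


if __name__ == "__main__":
    conns = [
        Connection("fast", 256 * 1024, 32 * 1024),
        Connection("slow", 256 * 1024, 32 * 1024),
    ]
    steps = run_all_to_all(conns, threshold=2,
                           completion_period={"fast": 1, "slow": 4})
    for c in conns:
        print(c.peer, "max outstanding sends:", c.max_outstanding)
    print("steps:", steps)
```

In this model the slow connection never accumulates more than `threshold + 1` outstanding sends, while the fast connection keeps making progress independently. The chunk size and threshold used here are illustrative values, not the ones evaluated in the paper.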
