
T-NBC: Transparent Non-blocking MPI Collective Operations (cited by: 3)
Abstract: Converting blocking collective operations into non-blocking implementations inside the MPI communication library, without modifying the application, allows collective communication to overlap with the computation that immediately follows the collective operation, which improves application performance. In applications, that computation falls into two classes: communication-unrelated computation (CURC) and communication-related computation (CRC). Collective communication overlaps well with CURC. CRC, however, must access the communicated data, so its overlap depends closely on the order in which the sub-messages of a collective operation are transmitted. In this paper we implement T-NBC (Transparent Non-Blocking Collective operations), non-blocking collective operations that are transparent to the application. T-NBC fully overlaps collective communication with CURC and, to further increase the overlap with CRC, assigns the sub-messages different communication priorities according to the order in which the application accesses them. Micro-benchmarks show that T-NBC can overlap most of the collective communication with the computation following the collective operation. In the FT (Fourier Transform) and IS (Integer Sort) benchmarks of NPB (NAS Parallel Benchmarks), although the computation following the collective operations is dominated by CRC, a large portion of the collective communication time is overlapped, and performance improves by 5% and 36%, respectively.
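The effect of sub-message ordering on overlap with communication-related computation can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions that do not come from the paper: sub-messages arrive one at a time, each takes the same time `t_comm`, and processing each block takes `t_comp`. The function names and parameters are hypothetical. The model does not use MPI and is not the T-NBC implementation.

```python
def blocking_time(n_blocks, t_comm, t_comp):
    """Blocking collective: all sub-messages arrive before any computation starts."""
    return n_blocks * t_comm + n_blocks * t_comp


def overlapped_time(send_order, access_order, t_comm, t_comp):
    """Non-blocking collective in a toy model.

    Sub-messages arrive one after another in send_order. The application
    processes blocks in access_order and waits only if the block it needs
    has not arrived yet.
    """
    # Arrival time of each block under serialized transmission (assumption).
    arrival = {block: (i + 1) * t_comm for i, block in enumerate(send_order)}
    now = 0.0
    for block in access_order:
        # Wait for the block if needed, then compute on it.
        now = max(now, arrival[block]) + t_comp
    return now


access_order = [0, 1, 2, 3]
baseline = blocking_time(4, 1.0, 1.0)
# Sub-messages prioritized to match the order in which the application reads them.
prioritized = overlapped_time(access_order, access_order, 1.0, 1.0)
# Sub-messages sent in the opposite order: the first block needed arrives last.
reversed_send = overlapped_time(access_order[::-1], access_order, 1.0, 1.0)
```

In this toy setting, sending sub-messages in access order finishes in 5 time units instead of 8, while sending them in the reverse order gives no improvement over the blocking case. These figures describe the model only and are not measurements from the paper.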
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 11, pp. 2052-2063 (12 pages)
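Funding: Supported by the National High Technology Research and Development Program of China ("863" Program), project "Development of the Dawning 6000 Petaflops High-Productivity Computer System" (2009AA01A129).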
Keywords: transparent; non-blocking; collective operations; overlap; priorities

