
Implementation Methods and Performance Analysis of Non-Contiguous Data Communication in Network

Cited by: 9
Abstract: Non-contiguous data communication means that the sender transfers multiple blocks of data located at different addresses to multiple non-contiguous memory regions on the receiver. This communication pattern is common in scientific computing: applications such as solver computations, FFT, and fluid dynamics simulation involve transposed-matrix transfers, sub-matrix transfers of multi-dimensional (2D, 3D, and 4D) matrices, unstructured data access, and other non-contiguous transfers. The performance of non-contiguous data communication therefore has an important influence on many scientific computing applications. Several offloading and non-offloading methods for implementing non-contiguous data communication exist, but no prior work has evaluated the mainstream methods on a single platform, and none has summarized the situations for which each method is suitable. This paper summarizes the implementation methods of non-contiguous data communication and analyzes their performance. Its main contributions are: (1) a comprehensive summary of the implementation methods; (2) a series of detailed performance experiments with the different methods, using dependable micro-benchmarks and applications; (3) a comparative analysis and conclusions for the experimental platform; (4) potential problems and research directions for future work on non-contiguous data communication.

The paper first summarizes the current implementation methods. The non-offloading methods consist mainly of manual copying and callable interfaces such as MPI DDT (Message Passing Interface Derived Data Type), both of which rely on data movement in memory. The offloading methods comprise different implementations that use RDMA (Remote Direct Memory Access) technology to reduce data copying to varying degrees. The paper then uses both existing and self-designed benchmarks to measure in detail the performance of each of these methods. All experiments are run on the same platform so that the results can be compared and analyzed. The paper analyzes, at fine granularity, the overhead of data copying and of RDMA communication under different data distributions. In particular, it compares offloading based on the RDMA sg_list (scatter-gather list) with offloading based on UMR (User-mode Memory Registration), and summarizes the applicable situations and potential problems of each method. A data table is provided as a guideline for non-contiguous data communication on the authors' platform. The results show that RDMA offloading methods do outperform memory copying when the block size is large, but some problems remain: the low efficiency of the UMR MTT (Memory Translation Table) may degrade performance when the number of blocks becomes large. Finally, the paper verifies the analysis results through micro-application experiments and proposes optimization directions for the related technologies.
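To make the non-offloading "manual packing/unpacking" approach from the abstract concrete, here is a minimal sketch. It is not the authors' code: the function names (`submatrix_blocks`, `pack_blocks`, `unpack_blocks`) and the row-major byte layout are assumptions chosen for illustration. The sender copies each non-contiguous block of a sub-matrix into one contiguous buffer, and the receiver scatters that buffer back to non-contiguous addresses.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.

def submatrix_blocks(row_len, first_row, first_col, n_rows, n_cols, elem_size=1):
    """Return one (byte_offset, byte_length) pair per row of a sub-matrix
    stored inside a larger row-major matrix with row_len elements per row."""
    return [
        (((first_row + r) * row_len + first_col) * elem_size, n_cols * elem_size)
        for r in range(n_rows)
    ]


def pack_blocks(src, blocks):
    """Sender side: copy every non-contiguous block into one contiguous buffer."""
    packed = bytearray()
    for offset, length in blocks:
        packed += src[offset:offset + length]
    return bytes(packed)


def unpack_blocks(packed, dst, blocks):
    """Receiver side: scatter the contiguous buffer into non-contiguous regions of dst."""
    pos = 0
    for offset, length in blocks:
        dst[offset:offset + length] = packed[pos:pos + length]
        pos += length
    return dst


if __name__ == "__main__":
    matrix = bytes(range(16))                  # 4x4 matrix, one byte per element
    blocks = submatrix_blocks(4, 1, 1, 2, 2)   # 2x2 sub-matrix starting at row 1, column 1
    wire = pack_blocks(matrix, blocks)         # the contiguous message that would be sent
    received = unpack_blocks(wire, bytearray(16), blocks)
    print(blocks, list(wire), list(received))
```

Each `(offset, length)` pair corresponds to one block of the transfer. Manual packing pays two extra memory copies, one on each side. The offloading methods named in the abstract aim to reduce that copying by handing a description of the blocks to the network hardware.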
Authors: MA Xiao-Xiao; LU Gang; FU Bin-Zhang; AN Zhong-Qi; ZHU Hong-Rui; SHAO En; WANG Zhan; AN Xue-Jun (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100080; Russell Lab, HUAWEI Technology Co., Ltd., Beijing 100080)
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2020, No. 6, pp. 1123-1138 (16 pages)
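Funding: Supported by the National Key R&D Program of China (2018YFB0204400), the 13th Five-Year Plan National Key R&D Program of China (2016YFB0200205), the Young Scientists Fund of the National Natural Science Foundation of China (61702484), the Strategic Priority Research Program (Category B) of the Chinese Academy of Sciences (XDB24050100), and a cooperation project with Huawei Technologies Co., Ltd. (YBN2018015521).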
Keywords: non-contiguous data communication; remote direct memory access (RDMA) offloading; sender gather receiver scatter (SGRS); user-mode memory registration (UMR); manual packing/unpacking