Abstract
Advances in persistent memory and remote direct memory access (RDMA) open new opportunities for designing efficient distributed systems. However, existing RDMA-based distributed systems do not fully exploit RDMA multicast capabilities, so one-to-many transfers still require sending multiple copies of the file data, which seriously degrades system performance. To address this problem, this paper proposes MTFS, an RDMA multicast transmission based distributed persistent memory file system. MTFS uses a low-latency multicast communication mechanism to exploit RDMA multicast and transmit data efficiently to multiple data nodes, avoiding the high latency of multi-copy transmission. To make transmission operations more flexible, MTFS provides a multi-mode multicast remote procedure call (RPC) mechanism that adaptively recognizes RPC requests and, through an optimized return mechanism, moves some transmission operations off the critical path to further improve transmission efficiency. MTFS also provides a lightweight consistency guarantee mechanism: with a crash recovery function, a data verification system, a retransmission scheme, and a window mechanism, it recovers quickly when a node crashes and accurately detects and corrects data when transmission errors occur, ensuring data reliability and consistency. Experimental results show that MTFS improves throughput by 10.2 to 219 times over the existing system GlusterFS across the test sets. Under a Redis database workload, MTFS outperforms NOVA by up to 10.7% and scales well in multi-threaded tests.
Authors
陈茂棠
郑圣安
游理通
王晶钰
闫田
屠要峰
韩银俊
黄林鹏
Chen Maotang; Zheng Sheng'an; You Litong; Wang Jingyu; Yan Tian; Tu Yaofeng; Han Yinjun; Huang Linpeng (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240; Department of Computer Science and Technology, Tsinghua University, Beijing 100084; ZTE Corporation, Nanjing 210012)
Source
《计算机研究与发展》
EI
CSCD
PKU Core (Peking University Core Journals)
2021, Issue 2, pp. 384-396 (13 pages)
Journal of Computer Research and Development
Funding
National Key Research and Development Program of China (2018YFB1003302)
Shanghai Jiao Tong University-Huawei Joint Laboratory Project (FA2018091021-202004)