An Efficient Failure Reconstruction Based on In-Network Computing for Erasure-Coded Storage Systems

Cited by: 6
Abstract: Distributed storage systems keep growing in scale, and whether they are built on disks or solid-state drives, they always face the risk of data loss. Traditional distributed storage systems mostly rely on three-way replication for high reliability, but to lower storage overhead many systems are shifting to erasure coding. Under erasure coding, however, reconstructing a failed block requires reading from multiple storage devices, which generates heavy network traffic and storage I/O and increases recovery overhead. To reduce recovery overhead without sacrificing other performance, this paper uses software defined networking (SDN) to build an efficient failure reconstruction scheme based on in-network computing, called in-network pipeline (INP). The SDN controller uses its global view of the network topology to construct a reconstruction tree; the system transmits data along this tree, and the switches perform part of the computation, which reduces the traffic forwarded downstream, removes the network bottleneck, and improves recovery performance. The recovery efficiency of INP is evaluated under different network bandwidths. The experimental results show that, compared with a conventional erasure-coded system, INP always reduces network traffic substantially and, under certain bandwidth conditions, brings the degraded read time close to that of a normal read.
Authors: Tang Yingjie; Wang Fang; Xie Yanwen (Wuhan National Laboratory for Optoelectronics (Huazhong University of Science and Technology), Wuhan 430074; Key Laboratory of Information Storage System (Huazhong University of Science and Technology), Ministry of Education, Wuhan 430074; Shenzhen Huazhong University of Science and Technology Research Institute, Shenzhen, Guangdong 518000)
Source: Journal of Computer Research and Development (indexed in EI, CSCD, and the Peking University Core Journals list), 2019, No. 4, pp. 767-778 (12 pages)
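Funding: National Natural Science Foundation of China (61772216); Wuhan Applied Basic Research Program (2017010201010103); Shenzhen Science and Technology Program (JCYJ20170307172248636); Fundamental Research Funds for the Central Universities; National Defense Pre-Research Project (31511010202)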
Keywords: distributed storage system; erasure code; software defined networking (SDN); recovery overhead; in-network computing
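
The abstract's central idea, switches combining blocks along a reconstruction tree so that less traffic is forwarded toward the requester, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a single-parity XOR code in place of the general erasure code, a hand-written tree in place of one built by an SDN controller, and hypothetical names throughout (`TREE`, `xor_blocks`, `reconstruct`, `recover`).

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical reconstruction tree (an assumption, not taken from the paper):
# each switch maps to its children; nodes absent from the keys are storage
# nodes that each hold one surviving block.
TREE = {
    "core": ["tor1", "tor2"],
    "tor1": ["n1", "n2"],
    "tor2": ["n3", "n4"],
}

def reconstruct(node, tree, blocks, aggregate, uplink_load):
    """Return the blocks that `node` forwards toward the requester.

    uplink_load[node] records how many blocks cross this node's uplink.
    With aggregate=True a switch XORs its inputs and forwards one block;
    with aggregate=False it simply relays every block it receives.
    """
    if node not in tree:                      # leaf: a storage node
        out = [blocks[node]]
    else:                                     # internal node: a switch
        incoming = []
        for child in tree[node]:
            incoming.extend(
                reconstruct(child, tree, blocks, aggregate, uplink_load))
        out = [xor_blocks(incoming)] if aggregate else incoming
    uplink_load[node] = len(out)
    return out

def recover(tree, root, blocks, aggregate):
    """Rebuild the lost block at the requester and report per-uplink load."""
    load = {}
    received = reconstruct(root, tree, blocks, aggregate, load)
    return xor_blocks(received), load

if __name__ == "__main__":
    surviving = {f"n{i}": bytes([i] * 4) for i in range(1, 5)}
    for aggregate in (False, True):
        block, load = recover(TREE, "core", surviving, aggregate)
        print("in-network" if aggregate else "relay only", block.hex(), load)
```

Under these assumptions both modes rebuild the same block, but with aggregation each switch uplink carries a single block instead of every block from its subtree. That reduction is the traffic saving the abstract attributes to computing inside the network.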