Abstract
To address the low efficiency and high cost of data disaster recovery in the TaoBao File System (TFS), a solution based on a low-density random erasure code is proposed. The solution introduces a new high-performance erasure code, the Stochastic Random Matrix (SRM) code, to encode multiple blocks that hold original data in TFS; the resulting redundant information is stored in new blocks for later recovery. Unlike the three-replica strategy used by the original system, the scheme treats each data block in TFS as one information unit for disaster recovery. When some blocks in the cluster become abnormal or fail, the encoding matrix of the SRM code is used to set up decoding equations over the other related blocks, and solving those equations recovers the lost data. Experiments on TFS-like clusters show that the mechanism uses about 60% to 70% less storage space than replication, and that its recovery efficiency and scalability exceed those of other erasure coding methods, which is a substantial improvement for TFS.
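The encode-then-solve idea described in the abstract can be illustrated with a minimal sketch. This is not the SRM code itself, whose matrix construction is not given here. The sketch assumes arithmetic over GF(2) (plain XOR), equal-sized blocks, and a sparse random 0/1 matrix; the function names `make_parity_rows`, `encode`, and `recover` are hypothetical and do not come from TFS.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (addition over GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity_rows(k, m, density, seed=None):
    """Assumed construction: m sparse random 0/1 rows of length k, none all-zero."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        row = [1 if rng.random() < density else 0 for _ in range(k)]
        if not any(row):
            row[rng.randrange(k)] = 1  # make sure every parity block covers some data
        rows.append(row)
    return rows


def encode(data_blocks, parity_rows):
    """Each parity block is the XOR of the data blocks selected by one matrix row."""
    size = len(data_blocks[0])
    parity = []
    for row in parity_rows:
        acc = bytes(size)
        for coeff, block in zip(row, data_blocks):
            if coeff:
                acc = xor_bytes(acc, block)
        parity.append(acc)
    return parity


def recover(survivors, parity_rows, parity_blocks, k):
    """Rebuild all k data blocks from surviving ones (dict: index -> block).

    One equation per parity block, with the missing blocks as unknowns,
    solved by Gauss-Jordan elimination over GF(2).
    """
    missing = [i for i in range(k) if i not in survivors]
    eqs = []
    for row, p in zip(parity_rows, parity_blocks):
        rhs = p
        for i, coeff in enumerate(row):
            if coeff and i in survivors:
                rhs = xor_bytes(rhs, survivors[i])  # move known terms to the right side
        eqs.append(([row[i] for i in missing], rhs))
    for col in range(len(missing)):
        pivot = next((r for r in range(col, len(eqs)) if eqs[r][0][col]), None)
        if pivot is None:
            raise ValueError("lost blocks are not recoverable with this matrix")
        eqs[col], eqs[pivot] = eqs[pivot], eqs[col]
        for r in range(len(eqs)):
            if r != col and eqs[r][0][col]:
                eqs[r] = ([a ^ b for a, b in zip(eqs[r][0], eqs[col][0])],
                          xor_bytes(eqs[r][1], eqs[col][1]))
    restored = dict(survivors)
    for col, idx in enumerate(missing):
        restored[idx] = eqs[col][1]
    return [restored[i] for i in range(k)]


if __name__ == "__main__":
    data = [bytes([i] * 8) for i in range(1, 5)]       # four toy data blocks
    rows = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]  # fixed matrix for the demo
    parity = encode(data, rows)
    survivors = {1: data[1], 3: data[3]}               # blocks 0 and 2 are lost
    print(recover(survivors, rows, parity, k=4) == data)
```

Whether a given loss pattern is recoverable depends on the matrix: a randomly drawn matrix may leave some patterns unsolvable, in which case this sketch raises `ValueError`. A production code would choose its matrix to control that probability.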
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2016, Issue A02, pp. 66-68, 81 (4 pages)
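Funding
National Natural Science Foundation of China (61501064)
Support Program of the Science and Technology Department of Sichuan Province (2015GZ0088)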
Keywords
TaoBao File System (TFS)
erasure code
data disaster recovery