Abstract
Large-scale distributed storage systems store data redundantly to guarantee availability and reliability. When a node fails, the data it stored must be repaired to preserve availability. However, because node failures are unpredictable, deciding when to repair the data is difficult. Many current systems repair immediately, but this reactive approach places a large amount of unnecessary load on the system. Based on an analysis of node failure behavior and the number of replicas, this paper proposes a two-stage data recovery strategy based on average offset. Experiments show that, while maintaining replica availability, the strategy effectively reduces the load that data recovery places on the system and improves the stability of the cluster.
Source
China Internet (《互联网天地》)
2013, No. 2, pp. 7-12 (6 pages)
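Funding
National Science and Technology Major Project of China (No. 2010ZX03004-001-02, No. 2011ZX03002-003-02, No. 2012ZX03002-004-004)
Sichuan Province Strategic Emerging Industries Development Promotion Project (No. SC2011510703006)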
Keywords
data recovery, replica redundancy, node failure, average offset, distributed storage