Node Failure Recovery Algorithm for Distributed File System Based on Measurement of Data Availability (cited by: 8)
Abstract The recovery strategies that existing distributed file systems use to handle node failure consume considerable bandwidth and disk space and affect system stability. By studying the HDFS cluster structure, its data block storage mechanism, and the relationship between node states and block states, we define the cluster node matrix, node status matrix, file block partition matrix, block storage matrix, and block state matrix, which together form the basic data model for measuring data block availability. Building on this availability measurement, we design a node failure recovery algorithm and analyze its performance. Experimental results show that, while keeping every data block in the system available, the new algorithm requires less bandwidth and disk space for recovery than the original strategy, shortens node recovery time, and improves system stability.
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 1, pp. 144-149 (6 pages)
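Funding Supported by the National Natural Science Foundation of China (60863003, 61063042) and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2011211A011)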
Keywords Cloud computing; Distributed file system; Failure recovery; Measurement of data availability
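
The abstract names a block storage matrix and a node status matrix but does not give their definitions, so the following is only a minimal sketch of how such matrices could be combined to measure block availability. The shapes, the function names `live_replica_counts` and `blocks_needing_recovery`, and the `min_replicas` threshold are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration: the matrix layout and all names below are
# assumptions, not the definitions used in the paper.

def live_replica_counts(storage, node_alive):
    """Count, for each block, the replicas held on nodes that are still up.

    storage[b][n] is 1 if block b has a replica on node n, else 0.
    node_alive[n] is 1 if node n is up, else 0.
    """
    return [sum(s * a for s, a in zip(row, node_alive)) for row in storage]


def blocks_needing_recovery(storage, node_alive, min_replicas=1):
    """Return indices of blocks whose live replica count is below min_replicas."""
    counts = live_replica_counts(storage, node_alive)
    return [b for b, c in enumerate(counts) if c < min_replicas]


# Three blocks spread over four nodes; node 1 has failed.
storage = [
    [1, 1, 0, 1],  # block 0: replicas on nodes 0, 1, 3
    [0, 1, 1, 0],  # block 1: replicas on nodes 1, 2
    [1, 0, 1, 1],  # block 2: replicas on nodes 0, 2, 3
]
node_alive = [1, 0, 1, 1]

counts = live_replica_counts(storage, node_alive)
to_recover = blocks_needing_recovery(storage, node_alive, min_replicas=2)
print(counts, to_recover)
```

Under this reading, only blocks whose live replica count drops below the threshold would be re-replicated after a failure, rather than every block that was stored on the failed node. That is one plausible way a recovery step could save bandwidth and disk space, but the paper's actual availability measure may differ.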

