
Research and Implementation on Erasure Code in Cloud File System

Cited by: 9
Abstract: Cloud file systems have become the foundation and core of cloud storage and big data thanks to their high performance, scalability, availability, and ease of management. They generally rely on full replication to improve fault tolerance, data utilization efficiency, and system performance. However, the storage overhead of full replication grows linearly with the number of replicas, and storing the replicas incurs extra write bandwidth and data management overhead. Erasure codes use carefully chosen redundancy encoding to provide high data reliability and availability without excessive additional storage. This paper studies the application of erasure codes in cloud file systems and examines their design from six aspects: erasure code type, coding object, coding time, data modification, data access method, and data access performance, with the aim of strengthening the storage model of cloud file systems. It also discusses the challenges and tradeoffs in designing erasure codes for cloud file systems. Finally, it designs and implements an erasure coding prototype, and experiments show that erasure codes can effectively protect data availability in cloud file systems while saving storage space.
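Source: Journal of Frontiers of Computer Science and Technology (计算机科学与探索), CSCD, 2013, Issue 4, pp. 315-325 (11 pages)
Funding: National Natural Science Foundation of China, No. 61133004; National High Technology Research and Development Program of China (863 Program), No. 2011AA01A203; International Cooperation Project of the Ministry of Science and Technology, No. 2009DFA12110
Keywords: cloud file system; erasure code; cloud computing; redundancy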
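
The abstract's storage comparison can be made concrete with a small sketch. It is not taken from the paper: the function names are hypothetical, and the single-parity XOR code is only the simplest possible erasure code, chosen for illustration; the prototype described in the abstract may well use a different code. The sketch computes the raw storage consumed per byte of user data under n-way replication and under a (k, m) erasure code, and shows how one lost block is rebuilt from the survivors.

```python
from functools import reduce


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per user byte with full replication (grows linearly)."""
    return float(replicas)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per user byte with k data blocks and m parity blocks."""
    return (k + m) / k


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_parity(data_blocks: list[bytes]) -> bytes:
    """Single-parity code (m = 1): the parity block is the XOR of all data blocks."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the one missing data block from the survivors and the parity block."""
    return xor_blocks(surviving_blocks + [parity])


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]           # k = 3 data blocks (illustrative)
    parity = encode_parity(data)                 # m = 1 parity block
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt == data[1])                    # the lost block is reconstructed
    print(replication_overhead(3), erasure_overhead(3, 1))
```

Under these assumptions, three-way replication stores three raw bytes per user byte, while the (3, 1) code stores about 1.33 and still survives the loss of any single block. Tolerating more simultaneous failures requires codes with m > 1, such as Reed-Solomon codes.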

References (14)

  • 1. Plank J S, Simmerman S, Schuman C D. Jerasure: a library in C/C++ facilitating erasure coding for storage applications - Version 1.2, UT-CS-08-627[R]. Department of Computer Science, University of Tennessee, 2008.
  • 2. Zhu Yunfeng, Lee P P C, Hu Yuchong, et al. On the speedup of single-disk failure recovery in XOR-coded storage systems: theory and practice[C]//Proceedings of the 28th IEEE Conference on Massive Storage Systems and Technologies (MSST '12), Monterey, CA, Apr 2012. Washington, DC, USA: IEEE Computer Society, 2012: 1-12.
  • 3. Weatherspoon H, Kubiatowicz J. Erasure coding vs. replication: a quantitative comparison[C]//Proceedings of the 1st International Workshop on Peer-To-Peer Systems (IPTPS '01), Mar 2002. London, UK: Springer-Verlag, 2002: 328-338.
  • 4. Shvachko K, Kuang H, Radia S, et al. The Hadoop distributed file system[C]//Proceedings of the 26th IEEE Symposium on Mass Storage Systems and Technologies (MSST '10). Washington, DC, USA: IEEE Computer Society, 2010: 1-10.
  • 5. Borthakur D, Gray J, Sarma J S, et al. Apache Hadoop goes realtime at Facebook[C]//Proceedings of the 2011 ACM International Conference on Management of Data (SIGMOD '11), Athens, Greece, 2011. New York, NY, USA: ACM, 2011.
  • 6. Zhang Zhe, Deshpande A, Ma Xiaosong, et al. Does erasure coding have a role to play in my data center?, MSR-TR-2010-52[R]. Microsoft Research, 2010.
  • 7. Chen Yanpei, Ganapathi A, Griffith R, et al. The case for evaluating MapReduce performance using workload suites[C]//Proceedings of the 2011 IEEE 19th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '11). Washington, DC, USA: IEEE Computer Society, 2011: 390-399.
  • 8. McKusick M K, Quinlan S. GFS: evolution on fast-forward[J]. Queue, 2009, 7: 10-20.
  • 9. Plank J S, Luo Jianqiang, Schuman C D, et al. A performance evaluation and examination of open-source erasure coding libraries for storage[C]//Proceedings of the 7th Conference on File and Storage Technologies (FAST '09), San Francisco, CA, USA, 2009. Berkeley, CA, USA: USENIX Association, 2009: 253-265.
  • 10. Fan Bin, Tantisiriroj W, Xiao Lin, et al. DiskReduce: RAID for data-intensive scalable computing[C]//Proceedings of the 4th Annual Workshop on Petascale Data Storage (PDSW '09), Portland, Oregon, 2009. New York, NY, USA: ACM, 2009: 6-10.

Co-cited documents (49)

  • 1. Wang Peng, Wang Xinmei. Research on fast encoding of LDPC codes[J]. Journal of Xidian University, 2004, 31(6): 934-938. Cited by: 22
  • 2. http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html.
  • 3. Rizzo L. Effective erasure codes for reliable computer communication protocols. ACM Computer Communication Review, 1997, 27(2).
  • 4. Weatherspoon H, Kubiatowicz J. Erasure coding vs replication: a quantitative comparison. In: Proc of the 1st Int'l Workshop on Peer-to-Peer Systems, 2002.
  • 5. McKusick M K, Quinlan S. GFS: evolution on fast-forward[J]. Queue, 2009(7).
  • 6. Dimakis A G, Godfrey P B, Wu Y, et al. Network coding for distributed storage systems[J]. IEEE Transactions on Information Theory, 2010, 56(9): 4539-4551.
  • 7. Dimakis A G, Ramchandran K, Wu Y, et al. A survey on network codes for distributed storage[J]. Proceedings of the IEEE, 2011, 99(3): 476-489.
  • 8. Fan B, Tantisiriroj W, Xiao L, et al. DiskReduce: RAID for data-intensive scalable computing[C]//Proceedings of the 4th Annual Workshop on Petascale Data Storage. ACM, 2009: 6-10.
  • 9. Bonvin N, Papaioannou T G, Aberer K. The costs and limits of availability for replicated services[C]//ACDC, 2009.
  • 10. Khan O, Burns R, Plank J, et al. Rethinking erasure codes for cloud file systems: minimizing I/O for recovery and degraded reads[C]//Proc of USENIX FAST, 2012.

Citing documents: 9

Second-level citing documents: 15
