
A Highly Reliable Storage Strategy Based on Hierarchical Coding

A high reliable storage architecture with hierarchical coding
Abstract: This paper studies reliable data storage for the big data era. Existing storage strategies, such as multi-replication and uniform coding, struggle to deliver high reliability and high space utilization at the same time. To address this, a hierarchical coding storage strategy with high reliability and low redundancy is proposed for big data. Because data differ in importance according to their type and life cycle, the strategy allows a separate fault tolerance level to be set for each type of data. Fault-tolerant coding schemes with different degrees of redundancy are implemented within one unified storage architecture, and a small set of simple parameters selects the fault tolerance level at which each piece of data is encoded and stored. Storage overhead is reduced further by dynamically lowering the redundancy of historical data. Experiments verify the effectiveness of the strategy. Important small files (for example, small text files) are stored as encoded fragments at a high fault tolerance level, so that all data can be repaired quickly from a subset of the encoded fragments even when 95% of the system's storage nodes have failed. Ordinary files (for example, large media files) use a suitably relaxed fault tolerance level; compared with the traditional 3-replica strategy, this brings the space overhead down from 200% to 50%, a saving of 1.5 times the data size, while still guaranteeing fast, lossless repair.
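The overhead figures in the abstract can be checked with simple arithmetic. The sketch below assumes an idealized (n, k) erasure code in which any k of the n stored fragments are enough to rebuild the data; the abstract does not name the codes or parameters the authors used, so the function names and the parameter pairs (6, 4) and (40, 2) are hypothetical values chosen only to reproduce the quoted percentages.

```python
from fractions import Fraction


def space_overhead(n: int, k: int) -> Fraction:
    """Extra storage, as a fraction of the original data size, for an (n, k) code."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return Fraction(n - k, k)


def tolerable_failure_fraction(n: int, k: int) -> Fraction:
    """Largest fraction of the n fragments that can be lost while k survive."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return Fraction(n - k, n)


# 3-replica storage behaves like an (n=3, k=1) code: 200% overhead.
replication = space_overhead(3, 1)

# Hypothetical relaxed level for ordinary files: (n=6, k=4) gives 50% overhead.
relaxed = space_overhead(6, 4)

# Difference in overhead, in multiples of the original data size.
saved = replication - relaxed

# Hypothetical high level for important small files: (n=40, k=2)
# survives the loss of 38 of 40 fragments, i.e. 95% of them.
high_level = tolerable_failure_fraction(40, 2)

print(replication, relaxed, saved, high_level)
```

Under these assumptions the saving works out to 1.5 times the data size, matching the abstract. The high fault tolerance level is expensive (the hypothetical (40, 2) code has 1900% overhead), which is consistent with reserving it for small, important files.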
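Source: Chinese High Technology Letters (《高技术通讯》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2013, Issue 11, pp. 1103-1109 (7 pages)
Funding: Supported by the 863 Program (2012AA01100, 2012AA01A401), the National Natural Science Foundation of China (61070028, 61003063), and the Chinese Academy of Sciences Strategic Priority Research Program (XDA06030200)
Keywords: big data, storage, reliability, fault tolerance, low redundancy, hierarchical, coding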

