A High-Reliability Storage Strategy Based on Hierarchical Coding (基于分级编码的高可靠存储策略)

A high reliable storage architecture with hierarchical coding


Abstract: This work studies reliable data storage for the current big data era. Traditional storage strategies such as multi-replication and uniform coding struggle to deliver high reliability and high space utilization at the same time. To address this, a highly reliable, low-redundancy hierarchical coding storage strategy for big data is proposed. The strategy recognizes that data differ in importance according to type and life cycle, so a fault-tolerance level can be set separately for each type of data. Fault-tolerant coding schemes of different redundancy are implemented within one unified storage architecture, and a small set of simple parameters selects the appropriate fault-tolerance coding level for each piece of data. Storage overhead is reduced further by dynamically lowering the redundancy of historical data. Experiments confirm the effectiveness of the strategy. Important small files, such as small text files, are stored as coded fragments at a high fault-tolerance level, so that all data can be repaired quickly from a subset of the coded fragments even when 95% of the system's storage nodes have failed. Ordinary files, such as large media files, use a suitably relaxed fault-tolerance coding level. While still guaranteeing fast and lossless repair, this brings the space overhead down from 200% to 50% compared with the traditional triplication (3-replica) strategy, a saving of 1.5 times the original data size.
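The overhead figures in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: it assumes an MDS erasure code (for example Reed-Solomon) with `k` data fragments and `m` parity fragments placed one per node, and the `(k, m)` values in `LEVELS` are hypothetical choices picked to reproduce the 50% and 95% figures. The record does not state which code or parameters the authors actually used.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Extra space, as a fraction of the original data size, for n-way replication."""
    return Fraction(copies - 1)


def coding_overhead(k: int, m: int) -> Fraction:
    """Extra space for a (k, m) code: k data fragments plus m parity fragments."""
    return Fraction(m, k)


def tolerated_failure_fraction(k: int, m: int) -> Fraction:
    """Largest fraction of the k + m fragment nodes that may fail.

    Assumes an MDS code (any k of the k + m fragments rebuild the data)
    and one fragment per node.
    """
    return Fraction(m, k + m)


# Hypothetical fault-tolerance levels, NOT taken from the paper.
LEVELS = {
    "important_small_file": (5, 95),  # high redundancy, survives 95% node loss
    "ordinary_file": (10, 5),         # relaxed level, 50% overhead
}


def describe(level: str) -> dict:
    """Return overhead and tolerated failure fraction for a named level."""
    k, m = LEVELS[level]
    return {
        "overhead": coding_overhead(k, m),
        "max_failed_fraction": tolerated_failure_fraction(k, m),
    }


if __name__ == "__main__":
    print("3-replica overhead:", replication_overhead(3))
    for name in LEVELS:
        print(name, describe(name))
```

Under these assumptions, triplication costs 2 extra copies (200%), the relaxed level costs half a copy (50%), and the difference is 1.5 times the original data size, which matches the saving the abstract reports.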

Authors: Feng Qingqing (冯清青), Meng Dan (孟丹), Han Jizhong (韩冀中)

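Affiliations: Computer Application Research Center, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Institute of Information Engineering, Chinese Academy of Sciences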

Source: Chinese High Technology Letters (《高技术通讯》), 2013, No. 11, pp. 1103-1109 (7 pages). Indexed in: CAS, CSCD, Peking University Core Journals

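Funding: Supported by the 863 Program (2012AA01100, 2012AA01A401), the National Natural Science Foundation of China (61070028, 61003063), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200)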

Keywords: big data, storage, reliability, fault tolerance, low redundancy, hierarchical coding

Classification: TP333 [Automation and Computer Technology: Computer System Architecture]


