Abstract
As society becomes increasingly digital and networked, the volume of data that IT enterprises worldwide must manage is growing rapidly. Large-scale data centers face steadily rising requirements for scalability, performance, and cost in managing massive, complex data. Data deduplication is used to slow the growth of enterprise storage capacity, but conventional deduplication-based storage management techniques can no longer meet the quality-of-service (QoS) requirements of big data backup applications, while advances in software and hardware technology create opportunities to improve big data management. This paper presents an application-aware parallel deduplication storage system for big data backup. It uses emerging non-volatile storage to improve the concurrent query capability of the chunk index, and it designs an application-aware data routing mechanism that exploits the rich file semantic information available at the application layer. Experimental results show that the proposed system not only achieves high-performance parallel deduplication within a single node, but also improves cluster deduplication throughput through horizontal scaling.
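The abstract names two mechanisms, a per-node chunk index and application-aware data routing, without giving their details. The following is only a minimal sketch of the general idea, not the paper's design. Everything specific in it is an assumption made for illustration: fixed-size 4 KiB chunks, SHA-1 fingerprints, routing by file extension as a stand-in for "file semantic information", an in-memory dictionary as a stand-in for an index on non-volatile storage, and the hypothetical names `DedupNode`, `route`, and `backup`.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # assumed fixed chunk size; the abstract does not specify chunking


class DedupNode:
    """One storage node. The dict is a stand-in for a chunk index on non-volatile storage."""

    def __init__(self):
        self.index = {}         # fingerprint -> chunk bytes (unique chunks only)
        self.logical_bytes = 0  # bytes received before deduplication

    def store(self, data: bytes) -> list:
        """Split data into chunks, keep only unseen chunks, return the file recipe."""
        recipe = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            fp = hashlib.sha1(chunk).hexdigest()
            self.index.setdefault(fp, chunk)  # no-op if the chunk is a duplicate
            recipe.append(fp)
        self.logical_bytes += len(data)
        return recipe

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.index.values())


def route(filename: str, num_nodes: int) -> int:
    """Assumed routing rule: files of the same type (extension) go to the same node."""
    ext = os.path.splitext(filename)[1].lower()
    digest = hashlib.sha1(ext.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_nodes


def backup(files: dict, nodes: list) -> dict:
    """Route each file to a node, then let every node deduplicate its batch in parallel."""
    batches = {i: [] for i in range(len(nodes))}
    for name, data in files.items():
        batches[route(name, len(nodes))].append((name, data))

    def run(i):
        # Each node is touched by exactly one worker, so its index needs no lock here.
        return {name: nodes[i].store(data) for name, data in batches[i]}

    recipes = {}
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        for part in pool.map(run, range(len(nodes))):
            recipes.update(part)
    return recipes
```

Under these assumptions, routing files of the same type to the same node keeps likely duplicates together, so each node can check only its own index, and adding nodes adds independent workers.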
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core (北大核心)
2015, Issue S2, pp. 139-147 (9 pages)
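Funding
National Natural Science Foundation of China (61402518)
National High Technology Research and Development Program of China ("863" Program) (2012AA01A509, 2012AA01A510)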
Keywords
big data backup
parallel deduplication
application awareness
non-volatile storage
scalability