Similarity-Based Client-Server Data Deduplication Method in Cloud Storage (cited by: 2)
Abstract: Data deduplication is one of the main ways to improve storage utilization in cloud storage systems. Large data volumes create an index I/O bottleneck during duplicate detection and raise the risk of data block fingerprint collisions. To improve detection efficiency and deduplication accuracy, a fast and secure data deduplication method is proposed. The method adopts a dual detection framework spanning the client and the server. It determines data block boundaries from file content using the sliding window technique and the Rabin fingerprint algorithm, and it computes data block fingerprints with SHA3, the third-generation secure hash function, instead of the traditional MD5 and SHA1 algorithms. A two-level indexing strategy based on data similarity is also proposed to speed up index lookup and comparison. Experimental results show that the client-server dual detection framework effectively improves duplicate detection efficiency, and that SHA3-based data block fingerprints are more precise and effectively improve deduplication accuracy.
Authors: Yan Cairong (燕彩蓉), Qian Kai (钱凯)
Source: Journal of Donghua University (Natural Science) (《东华大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core list, 2018, No. 1, pp. 115-122 (8 pages)
Funding: National Natural Science Foundation of China (61402100); Fundamental Research Funds for the Central Universities (16D111210)
Keywords: cloud storage; data deduplication; sliding window technique; data fingerprint
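
The chunking and fingerprinting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It swaps in a simple polynomial rolling hash for a true Rabin fingerprint, and every constant (`WINDOW`, `BASE`, `MOD`, `MASK`, `MIN_CHUNK`, `MAX_CHUNK`) and function name is an assumption chosen for illustration. The client-server split and the two-level similarity index are left out because the abstract does not give their details.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # sliding-window width in bytes
BASE = 257             # polynomial base of the rolling hash
MOD = (1 << 31) - 1    # modulus of the rolling hash
MASK = (1 << 6) - 1    # cut when the low 6 bits of the hash are zero
MIN_CHUNK = 32         # minimum chunk length (must be >= WINDOW)
MAX_CHUNK = 256        # forced cut at this length


def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    start = 0
    h = 0
    for i, b in enumerate(data):
        pos = i - start  # index of this byte inside the current chunk
        if pos >= WINDOW:
            # Slide the window: drop the oldest byte's contribution.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        length = pos + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield start, i + 1
            start = i + 1
            h = 0  # restart the rolling hash for the next chunk
    if start < len(data):
        yield start, len(data)  # trailing chunk


def fingerprint(chunk: bytes) -> str:
    """SHA3-256 digest of a chunk, used as its fingerprint."""
    return hashlib.sha3_256(chunk).hexdigest()


def deduplicate(data: bytes, store: dict) -> list:
    """Add unseen chunks to `store`; return the list of chunk fingerprints."""
    recipe = []
    for s, e in chunk_boundaries(data):
        chunk = data[s:e]
        fp = fingerprint(chunk)
        store.setdefault(fp, chunk)  # keep only the first copy of each chunk
        recipe.append(fp)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original bytes from a fingerprint list."""
    return b"".join(store[fp] for fp in recipe)
```

Calling `deduplicate(data, store)` with a shared `store` dictionary for several files keeps a single copy of each distinct chunk, and `restore(recipe, store)` reassembles a file from its fingerprint list. Because a cut depends only on the last `WINDOW` bytes once a chunk has reached `MIN_CHUNK`, boundaries tend to realign after a local insertion or deletion, which is the usual motivation for content-defined chunking over fixed-size blocks.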