
Design and implementation of a winnowing-based deduplication data backup and recovery system (cited by: 3)
Abstract: To address the large amount of redundant information found in massive data sets, this paper designs and implements a file backup and recovery system based on data deduplication. The system uses an improved Winnowing dynamic chunking algorithm to split files into variable-length data blocks, and combines a digest algorithm, an index table, and data compression so that the server stores only one copy of each data block, thereby eliminating duplicate data. Experiments show that the system reduces network traffic more than cwRsync does and, compared with traditional compression techniques, further reduces disk space usage.
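Source: Journal of Sichuan University (Natural Science Edition), indexed in CAS, CSCD, and the Peking University Core Journals list; 2012, No. 3, pp. 535-542 (8 pages)
Funding: National Natural Science Foundation of China (61173159); Ministry of Education Innovation Project, Major Project Cultivation Fund (708075)
Keywords: winnowing; data deduplication; file backup and recovery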
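
The pipeline described in the abstract (content-defined chunking, a digest per chunk, an index table, and compression) can be illustrated with a minimal sketch. This is not the paper's implementation: the improved Winnowing variant is not specified in this record, so plain winnowing (the rightmost minimum hash in each sliding window marks a cut point) is used instead. SHA-1 as the digest, zlib as the compressor, CRC32 as the window hash, the parameters `k` and `w`, and the names `winnow_boundaries` and `ChunkStore` are all assumptions made for illustration.

```python
import hashlib
import zlib
from typing import Dict, List


def winnow_boundaries(data: bytes, k: int = 8, w: int = 64) -> List[int]:
    """Pick chunk cut points with plain winnowing (assumed, not the paper's variant)."""
    n = len(data) - k + 1              # number of k-byte windows
    if n <= 0:
        return []
    # Hash every k-byte window; CRC32 stands in for a rolling hash for brevity.
    hashes = [zlib.crc32(data[i:i + k]) for i in range(n)]
    cuts: List[int] = []
    last = -1
    for start in range(max(n - w + 1, 1)):
        window = hashes[start:start + w]
        smallest = min(window)
        # Rightmost position of the minimum hash inside this window.
        pos = start + len(window) - 1 - window[::-1].index(smallest)
        if pos != last:
            last = pos
            if pos > 0:                # a cut at offset 0 would create an empty chunk
                cuts.append(pos)
    return cuts


class ChunkStore:
    """Hypothetical server-side store keeping one compressed copy per unique chunk."""

    def __init__(self) -> None:
        self.index: Dict[str, bytes] = {}   # index table: digest -> compressed chunk

    def backup(self, data: bytes) -> List[str]:
        """Store unseen chunks and return the file's recipe (ordered list of digests)."""
        recipe: List[str] = []
        prev = 0
        for cut in winnow_boundaries(data) + [len(data)]:
            chunk = data[prev:cut]
            prev = cut
            if not chunk:
                continue
            digest = hashlib.sha1(chunk).hexdigest()
            if digest not in self.index:    # only new chunks are stored (or sent)
                self.index[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe: List[str]) -> bytes:
        """Rebuild a file by decompressing and concatenating its chunks in order."""
        return b"".join(zlib.decompress(self.index[d]) for d in recipe)
```

Because cut points depend only on nearby content, inserting a few bytes near the start of a file shifts the later boundaries along with the data, so most chunks keep the same digest and are neither stored nor transferred a second time. A real system would likely replace the per-window CRC32 with a rolling hash and persist the index table on disk.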
