
Research on the Application of the Rabin Fingerprint Algorithm to Duplicate Data Detection
Abstract: The Rabin fingerprint algorithm is computationally efficient and has good randomness, and it confines the effect of a data change on the sequence of consecutive fingerprints to a local region. For these reasons it is widely used in duplicate data detection. This paper analyzes how Rabin fingerprints are computed over the finite field GF(2^n) and derives a fast formula for the fingerprint of a fixed-length character sequence as the sliding window moves. The application of the Rabin fingerprint algorithm to duplicate data detection is described in pseudocode and implemented in VC++. Three types of files (Word documents, program source code, and BMP images) were extracted from an ordinary computer as the test data set, and the test results show that the algorithm is effective.
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2013, No. 7X, pp. 4918-4920, 4932 (4 pages)
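Funding: Guangdong Distance and Open Education Research Fund Project (YJ1333); Shaoguan City Innovation Fund Project (201210); Shaoguan University Research Project (201202)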
Keywords: storage system; duplicate data detection; Rabin fingerprint; content-defined chunking; finite field (Galois field)
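
The abstract refers to a fast sliding-window fingerprint formula and to content-based chunking, but neither the formula nor the pseudocode is reproduced on this page. The following is a minimal Python sketch of the commonly used form of the technique, not the paper's VC++ implementation. It treats the bytes of a window as a polynomial over GF(2), reduces it modulo an irreducible polynomial P(x), and updates the fingerprint F as the window slides by one byte using F' = ((F xor b_out·x^(8(w-1))) · x^8 xor b_in) mod P(x). The polynomial, the window size, the boundary mask, and the use of SHA-1 to identify chunks are all illustrative assumptions, and every function name is hypothetical.

```python
import hashlib
from typing import Iterator, List

# Illustrative parameters (assumptions, not taken from the paper).
DEGREE = 64
POLY = (1 << DEGREE) | 0x1B  # x^64 + x^4 + x^3 + x + 1, irreducible over GF(2)


def mod_p(value: int) -> int:
    """Reduce a GF(2) polynomial (stored as the bits of an int) modulo POLY."""
    while value.bit_length() > DEGREE:
        value ^= POLY << (value.bit_length() - DEGREE - 1)
    return value


def fingerprint(data: bytes) -> int:
    """Compute the fingerprint of a byte string directly, one byte at a time."""
    fp = 0
    for byte in data:
        fp = mod_p((fp << 8) | byte)
    return fp


def rolling_fingerprints(data: bytes, window: int) -> Iterator[int]:
    """Yield the fingerprint of every window of `window` bytes, left to right."""
    if window <= 0 or len(data) < window:
        return
    # Contribution of the outgoing (oldest) byte: b * x^(8*(window-1)) mod P.
    out_table = [mod_p(b << (8 * (window - 1))) for b in range(256)]
    fp = fingerprint(data[:window])
    yield fp
    for i in range(window, len(data)):
        # Remove the oldest byte (XOR is subtraction in GF(2)), shift, add the new byte.
        fp = mod_p(((fp ^ out_table[data[i - window]]) << 8) | data[i])
        yield fp


def chunk_boundaries(data: bytes, window: int = 16, mask: int = 0x3F) -> List[int]:
    """Return chunk end offsets: cut wherever the low fingerprint bits are zero."""
    cuts = [
        end
        for end, fp in enumerate(rolling_fingerprints(data, window), start=window)
        if fp & mask == 0
    ]
    if data and (not cuts or cuts[-1] != len(data)):
        cuts.append(len(data))  # the last chunk always ends at end of data
    return cuts


def duplicate_bytes(data: bytes) -> int:
    """Count bytes that belong to chunks already seen earlier in `data`."""
    seen = set()
    duplicated = 0
    start = 0
    for end in chunk_boundaries(data):
        digest = hashlib.sha1(data[start:end]).digest()
        if digest in seen:
            duplicated += end - start
        else:
            seen.add(digest)
        start = end
    return duplicated
```

Because each boundary depends only on the bytes inside the window, inserting a byte near the start of a file shifts the later boundaries by one position instead of changing them, which is the locality property the abstract describes. A production implementation would normally also enforce minimum and maximum chunk sizes, which this sketch omits.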