Research on Deduplication Technology for Massive Image File Storage (海量图片文件存储去重技术研究)
Abstract: This paper presents a deduplication technique for storing massive numbers of image files, built on a distributed database combined with a distributed file system. The technique extracts a feature segment from each image file's binary stream, computes an MD5 signature from that segment, and deduplicates stored image files according to the signature. Analysis of experimental data shows that the technique deduplicates images accurately and achieves a high deletion rate. Compared with file-level and block-level deduplication, it also performs better in signature computation time and upload speed, which makes it an optimization for massive image data storage. The paper also proposes an improved scheme that addresses the technique's shortcomings.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal, 2014, No. 4, pp. 56-58 (3 pages)
Funding: National Natural Science Foundation of China (61272391)
Keywords: image file; deduplication; distributed; MD5
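
The signature-based deduplication described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which bytes make up the feature segment, how long it is, or how the distributed database and file system are accessed, so everything specific here is an assumption: the head/middle/tail sampling rule, the `SEGMENT_BYTES` value, the inclusion of the file length in the hash, and the in-memory `DedupStore` class that stands in for the distributed components. It is not the authors' implementation.

```python
import hashlib
from typing import Dict, Tuple

SEGMENT_BYTES = 4096  # assumed segment length; the abstract gives no value


def feature_segment(data: bytes, n: int = SEGMENT_BYTES) -> bytes:
    """Return a small sample of the byte stream (hypothetical selection rule)."""
    if len(data) <= 3 * n:
        return data  # small file: use the whole stream
    mid = len(data) // 2 - n // 2
    # Sample n bytes from the head, the middle, and the tail.
    return data[:n] + data[mid:mid + n] + data[-n:]


def signature(data: bytes) -> str:
    """MD5 hex digest over the file length plus the sampled segment."""
    h = hashlib.md5()
    h.update(len(data).to_bytes(8, "big"))  # assumed extra input, not from the paper
    h.update(feature_segment(data))
    return h.hexdigest()


class DedupStore:
    """In-memory stand-in for the distributed database and file system."""

    def __init__(self) -> None:
        self.index: Dict[str, str] = {}    # signature -> storage key
        self.blobs: Dict[str, bytes] = {}  # storage key -> file bytes

    def upload(self, name: str, data: bytes) -> Tuple[str, bool]:
        """Store data unless its signature is already indexed.

        Returns (storage key, True if a new copy was written).
        """
        sig = signature(data)
        if sig in self.index:
            return self.index[sig], False  # duplicate: reuse the stored copy
        self.index[sig] = name
        self.blobs[name] = data
        return name, True


store = DedupStore()
photo = bytes(range(256)) * 100          # placeholder bytes, not a real image
print(store.upload("a.jpg", photo))      # ('a.jpg', True)
print(store.upload("copy.jpg", photo))   # ('a.jpg', False)
```

Hashing only a sampled segment is what would make the signature cheaper to compute than a whole-file MD5. In this sketch the trade-off is that two files of equal length that differ only outside the sampled regions would be treated as duplicates, so the sampling rule has to match where image files actually differ.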
