
Research of Redis-based distributed storage method for massive small files (cited by: 22)
Abstract: Small files are an important means of transmitting and storing information and are widely used, and users' demands on their reliability and speed keep rising. To address the low efficiency of current small-file storage, this paper combines the large-file storage strengths of the distributed storage system HDFS with Redis caching and proposes a storage scheme that quickly merges small files. Small files are merged into a Sequence File that is stored on HDFS; load coefficients determined by multiple linear regression analysis are used to adjust load balancing; and a cache keeps file retrieval efficient. In the experiments, a corresponding file platform is built, and upload, retrieval, deletion, and memory footprint are compared against the traditional direct-upload approach. The results show that, compared with uploading files directly to HDFS, the improved small-file handling processes small files faster while preserving file reliability.
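Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journals, 2013, No. 10, pp. 58-64 (7 pages)
Funding: National Science and Technology Support Program project fund, Ministry of Science and Technology of China (2012BAH04F01); Science and Technology Innovation Platform (PXM2013_014212_000011)
Keywords: HDFS; small file; file cache; distributed file system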
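The abstract gives only the outline of the scheme: merge small files into one large container, then serve reads through a cache. The following is a minimal sketch of that idea, not the paper's implementation. Everything in it is an assumption made for illustration: the class names `MergedStore` and `CachedStore` are hypothetical, an in-memory byte buffer stands in for the merged file on HDFS, the record layout is invented and is not the Hadoop SequenceFile format, and an LRU dictionary stands in for Redis. The regression-based load coefficients are not modeled.

```python
import io
import struct
from collections import OrderedDict


class MergedStore:
    """Append-only container of small files plus an offset index.

    Hypothetical stand-in for a merged file on HDFS. The record layout
    below is illustrative and is NOT the Hadoop SequenceFile format.
    """

    def __init__(self):
        self._blob = io.BytesIO()  # stands in for the merged file on HDFS
        self._index = {}           # file name -> (payload offset, payload length)
        self.reads = 0             # number of reads that reached the container

    def put(self, name, data):
        """Append one small file as a record and remember where its payload is."""
        key = name.encode("utf-8")
        self._blob.seek(0, io.SEEK_END)
        # Record header: 4-byte name length, 4-byte payload length (big-endian).
        self._blob.write(struct.pack(">II", len(key), len(data)))
        self._blob.write(key)
        offset = self._blob.tell()
        self._blob.write(data)
        self._index[name] = (offset, len(data))

    def get(self, name):
        """Read one payload by seeking to its recorded offset."""
        offset, length = self._index[name]  # KeyError if the file is unknown
        self.reads += 1
        self._blob.seek(offset)
        return self._blob.read(length)

    def delete(self, name):
        """Drop the index entry only; the bytes stay in the container."""
        del self._index[name]


class CachedStore:
    """Cache-aside reads in front of MergedStore.

    An LRU OrderedDict stands in for Redis (an assumption of this sketch).
    """

    def __init__(self, store, capacity=2):
        self._store = store
        self._cache = OrderedDict()
        self._capacity = capacity

    def put(self, name, data):
        self._store.put(name, data)
        self._cache.pop(name, None)  # invalidate any stale cached copy

    def get(self, name):
        if name in self._cache:
            self._cache.move_to_end(name)  # mark as most recently used
            return self._cache[name]
        data = self._store.get(name)       # cache miss: read the container
        self._cache[name] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used
        return data

    def delete(self, name):
        self._cache.pop(name, None)
        self._store.delete(name)
```

In this sketch a repeated `get` for the same name is served from the cache, so `MergedStore.reads` grows only on misses. A real deployment would replace the buffer with an HDFS writer and the dictionary with a Redis client, and would need a compaction step to reclaim space after deletions.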