A Cache-Based Access Optimization Scheme for Massive Images (cited by: 1)

High Concurrent Access Optimization in HDFS Based on Cache for Mass Images
Abstract: To address the low efficiency of the Hadoop Distributed File System (HDFS) when storing massive numbers of images, this paper analyzes the basic HDFS architecture and its native file read and write flows, and on that basis proposes a cache-based storage optimization scheme for massive images, called CHDFS (HDFS Based on Cache). CHDFS introduces caching, read-ahead, and file merging to improve image read and write performance and to compensate for the weaknesses of HDFS in storing massive numbers of images. Merging multiple images into one large file reduces the number of metadata entries held in the Namenode and improves storage space utilization on the Datanodes. Because caching, read-ahead, and image merging are all transparent to the user, the scheme does not make HDFS any more complex to use. Experimental results show that CHDFS effectively improves the efficiency of storing and accessing images in HDFS without affecting the normal operation of HDFS.
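Author: Chen Yu (陈渝)
Source: Computer Measurement & Control (《计算机测量与控制》), PKU Core Journal, 2014, Issue 8, pp. 2669-2672, 2676 (5 pages)
Funding: Scientific Research Project of the Sichuan Provincial Department of Education (13ZA0135)
Keywords: HDFS; mass pictures; storage optimization; Cache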
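
The three mechanisms named in the abstract (image merging, caching, and read-ahead) can be illustrated with a small in-memory sketch. This is not the paper's implementation: the class `MergedImageStore`, its parameters `cache_size` and `read_ahead`, the LRU eviction policy, and the "prefetch the next images in write order" rule are all assumptions made for illustration, and an `io.BytesIO` buffer stands in for a single large HDFS file.

```python
import io
from collections import OrderedDict


class MergedImageStore:
    """Hypothetical toy model: many small images packed into one large blob,
    fronted by an LRU cache with simple read-ahead. Not the paper's code."""

    def __init__(self, cache_size=4, read_ahead=1):
        self._blob = io.BytesIO()    # stands in for one large merged HDFS file
        self._index = {}             # image name -> (offset, length) in the blob
        self._order = []             # image names in write order, used for read-ahead
        self._cache = OrderedDict()  # LRU cache: image name -> bytes
        self._cache_size = cache_size
        self._read_ahead = read_ahead
        self.backend_reads = 0       # number of reads served from the blob

    def put(self, name, data):
        """Append an image to the merged blob and record where it lives."""
        if name in self._index:
            raise KeyError(f"{name} already stored")
        offset = self._blob.seek(0, io.SEEK_END)
        self._blob.write(data)
        self._index[name] = (offset, len(data))
        self._order.append(name)

    def _load(self, name):
        """Read one image from the blob using its recorded offset and length."""
        offset, length = self._index[name]
        self._blob.seek(offset)
        self.backend_reads += 1
        return self._blob.read(length)

    def _remember(self, name, data):
        """Insert into the cache, evicting least recently used entries."""
        self._cache[name] = data
        self._cache.move_to_end(name)
        while len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)

    def get(self, name):
        """Return image bytes; on a miss, also prefetch the following images."""
        if name in self._cache:
            self._cache.move_to_end(name)
            return self._cache[name]
        data = self._load(name)
        self._remember(name, data)
        pos = self._order.index(name)
        for nxt in self._order[pos + 1 : pos + 1 + self._read_ahead]:
            if nxt not in self._cache:
                self._remember(nxt, self._load(nxt))
        return data
```

In this sketch only one large object exists no matter how many images are stored, which mirrors the abstract's point about reducing the number of metadata entries, and the caller only ever sees `put` and `get`, which mirrors the claim that merging, caching, and read-ahead stay transparent to the user.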