
Efficient method for storing small files in HDFS (cited by: 10)
Abstract: HDFS is designed for storing large files at the GB and TB scale and cannot efficiently store large numbers of small files. To address this, the duties of the NameNode are separated and a dedicated NFS server stores the metadata synchronously, which reduces the pressure of Client data requests, provides high-throughput data access, and improves access latency. A mapping scheme between files and data blocks is designed that allows multiple small files to be stored in the same block. The system was implemented, providing an effective solution for storing massive numbers of small files. Experimental results show that the mechanism achieves efficient and reliable access to massive numbers of small files while data volumes grow rapidly.
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core journal, 2015, No. 2, pp. 406-409 (4 pages)
Keywords: Hadoop distributed file system; massive small files; performance optimization; segregation of duties; merging small files
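
The abstract's idea of storing multiple small files in one block can be illustrated with a minimal sketch. This is not the paper's implementation: the class `BlockStore`, its `put`/`get` methods, the in-memory blocks, and the index layout (file name to block id, offset, length) are all hypothetical names chosen for illustration, and the block capacity is an assumed parameter.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Toy in-memory stand-in for a set of fixed-capacity data blocks."""

    # Assumed capacity; 64 MB is used here only as an illustrative value.
    block_size: int = 64 * 1024 * 1024
    blocks: list = field(default_factory=lambda: [bytearray()])
    # Hypothetical index: file name -> (block id, offset in block, length).
    index: dict = field(default_factory=dict)

    def put(self, name: str, data: bytes) -> tuple:
        """Append a small file to the current block and record its location."""
        if len(data) > self.block_size:
            raise ValueError("file does not fit in a single block")
        # Open a new block when the current one cannot hold the file.
        if len(self.blocks[-1]) + len(data) > self.block_size:
            self.blocks.append(bytearray())
        block_id = len(self.blocks) - 1
        offset = len(self.blocks[block_id])
        self.blocks[block_id] += data
        self.index[name] = (block_id, offset, len(data))
        return self.index[name]

    def get(self, name: str) -> bytes:
        """Read a file back using only its index entry."""
        block_id, offset, length = self.index[name]
        return bytes(self.blocks[block_id][offset:offset + length])
```

In this sketch one index entry per file replaces one block per file, so many small files share a block and a read needs only the block id, offset, and length. Where such an index would live (for example, alongside the metadata kept on the separate NFS server) is left open here.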


