
Portable operating system interface of UNIX compatibility technology in mass small distributed file system

Cited by: 5
Abstract: The mass small-file distributed file system SMDFS (Mass Small Distributed File System), developed on top of HDFS (Hadoop Distributed File System), inherits HDFS's incompatibility with POSIX (Portable Operating System Interface of UNIX) constraints. To address this, a POSIX compatibility technique based on local caching and an efficient metadata management technique based on a data staging area are proposed. First, a data staging area is set up to redirect file streams opened in read-write mode. An asynchronous thread pool model is then established to synchronize the mirror files held in the staging area, so that all POSIX-related file operations from the user layer to the storage layer can be completed. In addition, a metadata cache built on a skip list structure improves the efficiency of metadata operations such as listing a directory. Tests show that, compared with the Linux client of HDFS, SMDFS3.0 built on these techniques improves random-read performance by more than 10 times and sequential-read and sequential-write performance by about 3 to 4 times. Random-write performance reaches 20% of that of the local file system, and the directory-based metadata cache design makes the directory List operation nearly 10 times faster. However, because a client mounted through FUSE (Filesystem in Userspace) incurs extra overhead such as switching between kernel mode and user mode, the Linux client of SMDFS3.0 loses about 50% in performance relative to the system's Java interface.
Authors: CHEN Bo (陈博); HE Lianyue (何连跃); YAN Weiwei (严巍巍); XU Zhaomiao (徐照淼); XU Jun (徐俊) (College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; Beijing Netclouds Information Technology Corporation Limited, Beijing 100070, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 5, pp. 1389-1392, 1398 (5 pages)
Keywords: mass small file system; distributed file system; Portable Operating System Interface of UNIX (POSIX) compatibility; metadata cache; cloud storage
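
The abstract describes redirecting read-write file streams to a data staging area and synchronizing the staged mirror files through an asynchronous thread pool. The following is a minimal Python sketch of that general idea, not the authors' implementation. The class `StagedClient`, its methods, and the use of a local directory as a stand-in for the SMDFS storage layer are all assumptions made for illustration.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


class StagedClient:
    """Hypothetical client: random writes go to a local staging area, and a
    thread pool later copies each staged mirror file to the backing store."""

    def __init__(self, staging_dir, backing_dir, workers=4):
        self.staging_dir = staging_dir
        self.backing_dir = backing_dir  # assumption: stands in for the remote store
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def _staged(self, name):
        return os.path.join(self.staging_dir, name)

    def _backing(self, name):
        return os.path.join(self.backing_dir, name)

    def write_at(self, name, offset, data):
        """Random write, redirected to the staged mirror file."""
        path = self._staged(name)
        if not os.path.exists(path):
            if os.path.exists(self._backing(name)):
                # Pull the existing content into the staging area first.
                shutil.copyfile(self._backing(name), path)
            else:
                open(path, "wb").close()
        with open(path, "r+b") as f:
            f.seek(offset)
            f.write(data)

    def close(self, name):
        """Schedule asynchronous synchronization; returns a Future."""
        return self.pool.submit(self._sync, name)

    def _sync(self, name):
        # Whole-file upload: the backing store never sees an in-place write.
        tmp = self._backing(name) + ".tmp"
        shutil.copyfile(self._staged(name), tmp)
        os.replace(tmp, self._backing(name))
        os.remove(self._staged(name))
        return name

    def shutdown(self):
        self.pool.shutdown(wait=True)


def demo():
    """Write a file, sync it, overwrite part of it in place, sync again."""
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "staging")
        backing = os.path.join(root, "backing")
        os.makedirs(staging)
        os.makedirs(backing)
        client = StagedClient(staging, backing)
        client.write_at("a.txt", 0, b"hello world")
        client.close("a.txt").result()          # wait for the first sync
        client.write_at("a.txt", 6, b"POSIX")   # in-place overwrite at offset 6
        client.close("a.txt").result()
        client.shutdown()
        with open(os.path.join(backing, "a.txt"), "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

The sketch waits for each synchronization before writing to the same file again. A real client would also need to coordinate writes that arrive while a sync is in flight, which is omitted here.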