
SFFS: Low-Latency Small-File-Oriented Distributed File System (cited by: 5)
Abstract: Social networking services (SNS) and e-commerce services have developed rapidly. Such services need to store large numbers of small files such as pictures, music files, and microblog texts. Traditional distributed storage systems such as HDFS (Hadoop distributed file system) are designed for large files; when storing small files they suffer from excessive metadata overhead and high access latency, so they are ill-suited to workloads with massive numbers of small files. This paper analyzes the architecture and read-write flow of TFS (Taobao file system) and finds that TFS must establish at least three network connections for every read or write, which increases read-write latency. To address the challenges of storing massive numbers of small files and the problems found in TFS, this paper proposes a new low-latency, high-availability distributed storage scheme for massive small files and implements it as the distributed file system SFFS (small-file file system). Performance tests show that, compared with TFS, SFFS reduces write latency by 76.6% and read latency by about 10%. An analysis of the system structure further indicates that SFFS places a lighter load on the central node and recovers from failures faster than TFS, giving it an advantage in availability.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2014, Issue 4, pp. 438-445 (8 pages)
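Funding: National Natural Science Foundation of China, Grant No. 61272167; National High Technology Research and Development Program of China (863 Program), Grant No. 2011AA01A204; National Science and Technology Major Project "Core Electronic Devices, High-end Generic Chips and Basic Software", Grant No. 2012ZX01039-004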
Keywords: small file; low latency; high availability; distributed storage

