
A Document Resource Scheduling Mechanism for HDFS Cluster Across Multiple Data Centers (一种跨HDFS集群的文件资源调度机制)

Cited by: 5
Abstract: Clustered file systems are a typical class of distributed file system. By coordinating multiple nodes within a cluster, they eliminate single points of failure and performance bottlenecks, achieve high availability, high performance, and dynamic load balancing, and scale well, so they are often a key technology for implementing and providing cloud storage services. HDFS clusters, however, are mostly confined to deployment within a single data center, which limits their scalability. To address this, the paper proposes a file resource scheduling mechanism for cluster deployment across data centers, together with the Kingdee distributed file service, KDFS. A redesigned distributed architecture lets multiple HDFS clusters dynamically join a network and work together. A file resource pool hides the differences between files held in different clusters, so that transparent service can be provided to multiple applications. Elastic storage and optimal storage strategies ensure safe redundancy of cluster resources and nearest-site service while improving the storage efficiency of the clusters. Experiments and production use show that the mechanism not only resolves the limited scalability of HDFS clusters but, through cross-data-center deployment, also provides off-site redundant disaster recovery for cluster files, load balancing across data centers, and nearest-site file access, effectively improving the experience of applications that use the KDFS storage service.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core list, 2017, No. 9, pp. 2093-2110 (18 pages)
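Funding: Supported by the National "973" Key Basic Research Development Program of China (2016YFB1000800); the National Cloud Computing Demonstration Project (发改办高技[2011]2448号); the National Development and Reform Commission Big Data Demonstration Project (发改办高技[2014]648号); the Guangdong Provincial Special Fund for Information Industry Development (粤经信电软[2015]141号); the Guangdong Provincial Leading Talents Program (粤人才[2012]342号); and the Shenzhen Technology Development Project (FWY-CX20140310010238).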
Keywords: clustered file system; distributed file system; HDFS; file resource scheduling; cloud storage
