Abstract
A clustered file system is a typical kind of distributed file system. By coordinating multiple nodes within a cluster, it eliminates single points of failure and performance bottlenecks, achieves high availability, high performance, and dynamic load balancing, and offers relatively high scalability. It is therefore often used as one of the key technologies for implementing and providing cloud storage services. This paper addresses two limitations of HDFS clusters: they are mostly deployed within a single data center, and their scalability is limited. It proposes a file resource scheduling mechanism for cluster deployment across data centers, together with the Kingdee Distributed File System (KDFS). By redesigning the distributed architecture, KDFS supports dynamic networking and cooperation among multiple HDFS clusters. By introducing a file resource pool, it shields the file differences between clusters and provides transparent services to multiple applications. By introducing elastic storage and optimal storage strategies, it ensures safe redundancy of cluster resources and near-site service while improving the storage efficiency of the clusters. Experiments and practice show that the cross-HDFS-cluster file resource scheduling mechanism not only solves the limited scalability of HDFS clusters but also, through deployment across data centers, achieves off-site redundant disaster recovery of cluster files, load balancing across data centers, and near-site file access, effectively improving the experience of applications that use the KDFS storage service.
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
PKU Core Journals (北大核心)
2017, Issue 9, pp. 2093-2110 (18 pages)
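Funding
National Basic Research Program of China ("973" Program) (2016YFB1000800)
National Cloud Computing Demonstration Project (Fagaiban Gaoji [2011] No. 2448)
National Development and Reform Commission Big Data Demonstration Project (Fagaiban Gaoji [2014] No. 648)
Guangdong Province Information Industry Development Special Fund (Yue Jing Xin Dian Ruan [2015] No. 141)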
Guangdong Province Leading Talent Program (Yue Rencai [2012] No. 342)
Shenzhen Technology Development Project (FWY-CX20140310010238)
Keywords
clustered file system
distributed file system
HDFS
file resource scheduling
cloud storage