A High Performance Management Schema of Metadata Clustering for Large-Scale Data Storage Systems

Cited by: 3
Abstract: An efficient, decentralized metadata management schema is vital to the reliability and scalability of large-scale distributed storage systems. Metadata management schemas based on Hash partitioning and on subtree partitioning incur a high cost when the cluster expands and are sensitive to changes in cluster membership. To address these problems, this paper proposes CH-MMS (consistent Hash based metadata management schema), a clustering schema for metadata servers (MDS) built on a consistent Hash structure. CH-MMS introduces virtual MDSs on the consistent-Hash MDS cluster to balance load across the cluster. It combines a standby mechanism with a lazy-update policy and applies both to the MDS cluster, which achieves fast MDS failover and zero data migration when the cluster changes. Owing to its distributed metadata structure, CH-MMS offers fast metadata lookup. Because a Hash structure breaks the hierarchical semantics of the file system, a simple and flexible mechanism based on regular expression matching is introduced. The paper (1) describes the architecture of CH-MMS; (2) introduces the core data structure layout-table, the virtual MDS structure, the lazy-update policy, and the related algorithms; and (3) qualitatively analyzes the scalability and fault tolerance of CH-MMS. A prototype system and simulation experiments show that CH-MMS distributes metadata evenly, recovers quickly from failures, expands flexibly, and migrates no data when nodes join or leave. It can therefore meet the need for flexible, efficient metadata management in large-scale storage clusters whose data volume keeps growing.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, Issue 4, pp. 929-942 (14 pages)
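Funding: National Natural Science Foundation of China (61063012, 61363003); Guangxi Natural Science Foundation (2012GXNSFAA053222); Guangxi Higher Education Outstanding Talent Funding Program ([2011]40); Guangxi Scientific Research and Technology Development Program (桂科软13180015, 桂科攻1348020-7); Nanning Scientific Research and Technology Development Program (201109016A)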
Keywords: metadata management; consistent Hash; large-scale data storage; metadata server (MDS); distributed file system
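
The abstract describes placing virtual MDSs on a consistent Hash ring to balance load. The following is a minimal Python sketch of that general technique only, not the paper's implementation: the class `HashRing`, the `#vmds` naming of virtual nodes, the use of MD5, and the default of 100 virtual nodes per server are all illustrative assumptions. It does not model the layout-table, the standby mechanism, or the lazy-update policy, so it shows the reduced remapping of plain consistent hashing rather than the zero migration that CH-MMS reports.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (first 8 bytes of MD5; an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent Hash ring where each physical MDS owns several virtual MDSs."""

    def __init__(self, vnodes_per_server: int = 100):
        self.vnodes_per_server = vnodes_per_server
        self._points = []  # sorted ring positions of all virtual MDSs
        self._owner = {}   # ring position -> physical MDS name

    def add_server(self, name: str) -> None:
        # Place this server's virtual MDSs at pseudo-random ring positions.
        for i in range(self.vnodes_per_server):
            point = ring_hash(f"{name}#vmds{i}")
            if point in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            bisect.insort(self._points, point)
            self._owner[point] = name

    def remove_server(self, name: str) -> None:
        # Drop every virtual MDS that belongs to the departing server.
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: s for p, s in self._owner.items() if s != name}

    def lookup(self, path: str) -> str:
        """Return the physical MDS responsible for a file path."""
        if not self._points:
            raise LookupError("no metadata servers on the ring")
        # The first virtual MDS clockwise from the key's position owns the key.
        i = bisect.bisect_right(self._points, ring_hash(path))
        return self._owner[self._points[i % len(self._points)]]
```

In this sketch, adding a server can only move a path to the new server, and removing a server only moves the paths that server owned; every other path keeps its MDS. Raising `vnodes_per_server` generally evens out the share of paths each physical server receives, at the cost of a larger ring.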
