
Distributed File System Load Balancing in Cloud Environment

Cited by: 13
Abstract: The Hadoop Distributed File System (HDFS) is a low-cost, highly fault-tolerant distributed file system designed to run on commodity hardware. It provides high-throughput data access and suits applications that work on large datasets. HDFS nevertheless has open performance problems, one of which is inadequate load balancing. The load balancer that ships with Hadoop can rebalance a cluster, but it requires the user to supply a static threshold in advance. To remove the fixed and subjective nature of this threshold, this paper analyzes, evaluates, and optimizes parameters including disk space utilization, CPU utilization, memory utilization, disk I/O occupancy, and network bandwidth occupancy, and derives an expression for computing the threshold. The threshold calculation and the resulting load balancing are then verified through theoretical analysis and simulation experiments. The experimental results show that, compared with Hadoop's algorithm based on a static input threshold, this method achieves a better balancing effect and improves the utilization of computing resources.
Authors: WU Yaoyao; YANG Geng (College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing 210023, China)
Source: Computer Engineering and Applications (CSCD; Peking University Core Journal), 2019, No. 10, pp. 67-72, 224 (7 pages)
Funding: National Natural Science Foundation of China (Nos. 61572263, 61502251, 61502243); Natural Science Research Project of Jiangsu Higher Education Institutions (No. 14KJB520031); China Postdoctoral Science Foundation (No. 2016M601859); Natural Science Foundation of Jiangsu Province, General Program (No. BK20161516)
Keywords: cloud environment; Hadoop Distributed File System (HDFS); load balancing; dynamic threshold
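
The abstract says the threshold is computed from five node metrics instead of being supplied as a fixed value (as with the `-threshold` option of `hdfs balancer`), but it does not state the expression itself. The following Python sketch only illustrates the general idea under stated assumptions: the weights in `WEIGHTS`, the use of the population standard deviation as the threshold, and the names `NodeMetrics`, `node_load`, `dynamic_threshold`, and `classify` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical weights for the five metrics named in the abstract.
# The paper's actual expression is not given there; these sum to 1.
WEIGHTS = {"disk": 0.4, "cpu": 0.2, "mem": 0.2, "io": 0.1, "net": 0.1}


@dataclass
class NodeMetrics:
    name: str
    disk: float  # disk space utilization, in [0, 1]
    cpu: float   # CPU utilization, in [0, 1]
    mem: float   # memory utilization, in [0, 1]
    io: float    # disk I/O occupancy, in [0, 1]
    net: float   # network bandwidth occupancy, in [0, 1]


def node_load(m: NodeMetrics) -> float:
    """Combine the five metrics into one load score (assumed weighted sum)."""
    return sum(w * getattr(m, key) for key, w in WEIGHTS.items())


def dynamic_threshold(loads: list[float]) -> float:
    """Assumed rule: use the spread of the current loads as the threshold."""
    return pstdev(loads)


def classify(nodes: list[NodeMetrics]) -> tuple[dict[str, str], float]:
    """Label each node relative to the cluster mean plus/minus the threshold."""
    loads = {n.name: node_load(n) for n in nodes}
    avg = mean(loads.values())
    thr = dynamic_threshold(list(loads.values()))
    labels = {}
    for name, load in loads.items():
        if load > avg + thr:
            labels[name] = "over"
        elif load < avg - thr:
            labels[name] = "under"
        else:
            labels[name] = "balanced"
    return labels, thr


if __name__ == "__main__":
    cluster = [
        NodeMetrics("dn1", 0.9, 0.9, 0.9, 0.9, 0.9),
        NodeMetrics("dn2", 0.5, 0.5, 0.5, 0.5, 0.5),
        NodeMetrics("dn3", 0.1, 0.1, 0.1, 0.1, 0.1),
    ]
    labels, thr = classify(cluster)
    print(labels, round(thr, 4))
```

In this sketch the threshold is recomputed from the cluster's current state on every call, so it needs no user input. A real balancer would then move blocks from "over" nodes to "under" nodes, which is outside the scope of this example.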
