Abstract
To address the subjectivity, lag, and single-metric nature of the decision threshold in the default load balancing strategy of HDFS clusters, this paper analyzes the HDFS cluster architecture, the objects the cluster processes, and the dynamic, real-time changes of the cluster itself, and draws on existing threshold-improvement algorithms for load balancing strategies. It proposes a method that uses a prediction model to estimate file attributes and combines them with cluster attributes to compute the threshold comprehensively. The threshold computed by this method is then substituted into the corresponding load balancing strategy to optimize the load. Experimental results show that the prediction model estimates file attributes with high accuracy, that the load balancing strategy based on the prediction model adjusts cluster load efficiently, and that it further shortens job execution response time and improves the efficiency of cluster jobs.
Authors
于磊春
陈健美
刘响
胡杨
Yu Leichun; Chen Jianmei; Liu Xiang; Hu Yang (School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China)
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2018, Issue 5, pp. 149-156, 201 (9 pages in total)
Keywords
HDFS cluster
Prediction model
File attributes
Cluster attributes
Threshold
Load balancing