Abstract
In MapReduce systems, workloads differ widely in their energy-consumption characteristics, which makes it difficult for a cost-based scheduler to match workloads to nodes and thereby improve the energy efficiency of the cluster. To address this problem, a power estimation method based on the performance features of workloads is proposed. The method estimates the power consumption of online workloads from operating-system performance events (performance monitoring counters) on each worker node while MapReduce jobs execute. Machine learning is used to improve the estimation accuracy: performance features are collected during workload execution to build a sample set; attribute reduction from rough set theory then selects the performance attributes that most strongly affect energy consumption; and a power estimation model based on the least squares support vector machine is built on the reduced attributes. Experimental results show that the method estimates the power consumption of MapReduce workloads accurately, with an average relative error of 4% when a single job runs and 4.5% when multiple jobs share the cluster.
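The final modelling step named in the abstract, least squares support vector machine regression scored by mean relative error, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the kernel, the hyperparameters, or the counters retained after rough set reduction, so the RBF kernel, the values of `gamma` and `sigma`, the three synthetic features, and all function names here are hypothetical, and the attribute reduction step is omitted.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def lssvm_fit(x, y, gamma=100.0, sigma=1.0):
    """Fit LS-SVM regression by solving its linear system.

    [[0, 1^T], [1, K + I/gamma]] @ [b, alpha] = [0, y]
    Returns the bias b and the dual weights alpha.
    """
    n = x.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]


def lssvm_predict(x_train, b, alpha, x_new, sigma=1.0):
    """Predict power for new samples: f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual| over all samples."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))


def make_synthetic_samples(n, seed=0):
    """Hypothetical stand-in for measured data: three normalized
    performance features per sample and a made-up power value in watts."""
    rng = np.random.default_rng(seed)
    x = rng.random((n, 3))
    y = 60.0 + 40.0 * x[:, 0] + 15.0 * x[:, 1] + 5.0 * x[:, 2] ** 2
    return x, y


if __name__ == "__main__":
    x, y = make_synthetic_samples(100)
    x_train, y_train, x_test, y_test = x[:80], y[:80], x[80:], y[80:]
    b, alpha = lssvm_fit(x_train, y_train)
    predicted = lssvm_predict(x_train, b, alpha, x_test)
    print("mean relative error:", mean_relative_error(y_test, predicted))
```

In this formulation training reduces to a single linear solve rather than the quadratic program of a standard support vector machine. In practice the rows of `x` would be the reduced counter values sampled on a worker node and `y` the measured power; the synthetic numbers above say nothing about the accuracy reported in the paper.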
Source
Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2015, No. 2, pp. 14-19 (6 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
Funding
National Natural Science Foundation of China (61202041, 91330117)
National High Technology Research and Development Program of China (2011AA01A204, 2012AA01A306)