A Hadoop Performance Prediction Model Based on Random Forest
Authors: Zhendong Bei, Zhibin Yu, Huiling Zhang, Chengzhong Xu, Shenzhong Feng, Zhenjiang Dong, Hengsheng Zhang. ZTE Communications, 2013, No. 2, pp. 38-44 (7 pages)
MapReduce is a programming model for processing large data sets, and Hadoop is the most popular open-source implementation of MapReduce. To achieve high performance, up to 190 Hadoop configuration parameters must be manually tuned. This is not only time-consuming but also error-prone. In this paper, we propose a new performance model based on random forest, a recently developed machine-learning algorithm. The model, called RFMS, predicts the performance of a Hadoop system from the system's configuration parameters. RFMS is built from 2000 distinct fine-grained performance observations taken under different Hadoop configurations. We test RFMS against the measured performance of representative workloads from the Hadoop Micro-benchmark suite. The results show that the prediction accuracy of RFMS averages 95% and reaches up to 99%. This new, highly accurate prediction model can be used to automatically optimize the performance of Hadoop systems.
Keywords: big data; cloud computing; MapReduce; Hadoop; random forest; micro-benchmark
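
The idea in the abstract, learning a mapping from configuration parameters to performance with a random forest, can be illustrated with a minimal sketch. This is not the RFMS implementation. It is a small bagged ensemble of regression trees written with the Python standard library only, under several assumptions: the three parameter names are merely examples of Hadoop 1.x settings (the abstract does not say which parameters RFMS uses), `synthetic_runtime` is a made-up stand-in for measured job times, and "accuracy" is taken to mean one minus the mean relative error, which the abstract does not define.

```python
import random
import statistics

# Illustrative Hadoop 1.x parameter names; which parameters RFMS uses is not
# stated in the abstract, so this selection is an assumption.
PARAMS = {
    "io.sort.mb": [50, 100, 150, 200, 250, 300],
    "mapred.reduce.tasks": [2, 4, 8, 16, 32],
    "io.sort.factor": [10, 25, 50, 100],
}

def synthetic_runtime(cfg, rng):
    """Made-up runtime in seconds with 3% noise; NOT measured Hadoop data."""
    sort_mb, reducers, sort_factor = cfg
    base = 400.0 / reducers + 2.0 * reducers + 3000.0 / sort_mb + 200.0 / sort_factor
    return base * rng.uniform(0.97, 1.03)

def sample(n, rng):
    """Draw n random configurations and their synthetic runtimes."""
    X = [tuple(rng.choice(values) for values in PARAMS.values()) for _ in range(n)]
    y = [synthetic_runtime(cfg, rng) for cfg in X]
    return X, y

def grow_tree(X, y, rng, depth=0, max_depth=5, min_leaf=3):
    """Grow a regression tree; each split searches a random subset of features."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return statistics.fmean(y)  # leaf: mean runtime of the samples here
    n_feat = len(X[0])
    features = rng.sample(range(n_feat), max(1, n_feat - 1))
    best = None  # (sum of squared errors, feature index, threshold)
    for f in features:
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = statistics.fmean(left), statistics.fmean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:
        return statistics.fmean(y)
    _, f, thr = best
    li = [i for i, row in enumerate(X) if row[f] <= thr]
    ri = [i for i, row in enumerate(X) if row[f] > thr]
    return (
        f,
        thr,
        grow_tree([X[i] for i in li], [y[i] for i in li], rng, depth + 1, max_depth, min_leaf),
        grow_tree([X[i] for i in ri], [y[i] for i in ri], rng, depth + 1, max_depth, min_leaf),
    )

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if row[feature] <= threshold else right
    return node

def grow_forest(X, y, n_trees, rng):
    """Train each tree on a bootstrap resample of the observations (bagging)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """The forest's prediction is the average of its trees' predictions."""
    return statistics.fmean([predict_tree(tree, row) for tree in forest])

rng = random.Random(42)
X_train, y_train = sample(300, rng)
X_test, y_test = sample(100, rng)
forest = grow_forest(X_train, y_train, n_trees=25, rng=rng)
predictions = [predict_forest(forest, row) for row in X_test]

# Assumed accuracy definition: 1 - mean relative error on held-out configurations.
accuracy = 1.0 - statistics.fmean([abs(p - t) / t for p, t in zip(predictions, y_test)])
print(f"mean prediction accuracy on synthetic data: {accuracy:.1%}")
```

The printed figure describes only this synthetic setup and says nothing about the paper's reported 95%. Once such a model exists, candidate configurations can be ranked by predicted runtime instead of being benchmarked one by one, which is the automatic optimization use the abstract points to.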