Abstract
MapReduce is a new parallel computing framework with advantages in computing speed, fault tolerance, and reliability, and it has therefore been widely adopted in both commercial applications and scientific research. The scheduling algorithm is a core component of MapReduce, and its quality directly affects MapReduce performance, so it has attracted considerable attention. After introducing and analysing the MapReduce parallel computing model, we present several related improvements to the model and, based on the Hadoop platform, focus on the commonly used MapReduce scheduling algorithms and their improved variants. Through comparative analysis, we further discuss the future development of MapReduce and suggest effective approaches for improving its scheduling algorithms.
Source
《计算机应用与软件》
CSCD
2015, Issue 5, pp. 1-6, 16 (7 pages in total)
Computer Applications and Software
Funding
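Major Science and Technology Support Program of the Ministry of Science and Technology of China (2011BAK21B05)
Jiangsu Province Basic Research Program (Natural Science Foundation) project (BK2012363)
Jiangsu Province Special Guiding Fund Project for the Transformation and Upgrading of Industry and the Information Industry (2011C1)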