Abstract
To address the limited optimization of existing task scheduling algorithms on the Hadoop platform, this paper proposes a speculative task scheduling algorithm based on data locality. The algorithm computes the ratio of Map and Reduce task durations on each node and combines it with the locality of data on different nodes, which gives a more accurate task-progress detection method than existing algorithms for identifying fast and slow nodes. It then launches, on a fast node, a backup of the lagging task with the longest remaining time, moving computation rather than data. Experiments in a Hadoop environment show that, compared with existing scheduling algorithms, the proposed algorithm shortens the average task running time, reduces the network congestion caused by data exchange between cluster racks, and speeds up task execution.
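The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Task`, `remaining_time`, `node_rates`, and `pick_backup` are hypothetical, the remaining-time estimate assumes a constant progress rate per task, nodes are classed as fast when their mean progress rate is at or above the cluster mean, and the per-node Map/Reduce duration ratio used by the paper for progress estimation is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    node: str            # node currently running the task
    progress: float      # fraction complete, in (0, 1]
    elapsed: float       # seconds since the task started (> 0)
    replicas: frozenset  # nodes holding a copy of the task's input split


def remaining_time(task):
    # Assumption: the task keeps its observed progress rate.
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def node_rates(tasks):
    # Mean progress rate of the tasks running on each node.
    per_node = {}
    for t in tasks:
        per_node.setdefault(t.node, []).append(t.progress / t.elapsed)
    return {n: sum(r) / len(r) for n, r in per_node.items()}


def pick_backup(tasks):
    """Return (task_id, target_node) for one backup task, or None."""
    rates = node_rates(tasks)
    mean_rate = sum(rates.values()) / len(rates)
    # Assumption: a node is "fast" if its rate is at or above the mean.
    fast = {n for n, r in rates.items() if r >= mean_rate}
    lagging = [t for t in tasks if t.node not in fast]
    if not lagging or not fast:
        return None
    # Back up the lagging task with the longest estimated remaining time.
    victim = max(lagging, key=remaining_time)
    # Prefer a fast node that already stores the input split, so the
    # computation moves to the data instead of the data moving.
    local = sorted(fast & victim.replicas)
    if local:
        return victim.task_id, local[0]
    # Otherwise fall back to the fastest node overall.
    return victim.task_id, max(sorted(fast), key=rates.get)
```

Calling `pick_backup` on a list of running `Task` records yields the task to duplicate and the node to run the copy on. The locality check comes before the speed comparison so that a slightly slower node that already holds the data is chosen over a faster node that would need a cross-rack transfer.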
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2014, No. 1, pp. 182-187 (6 pages)
Application Research of Computers
Funding
Supported by the National Natural Science Foundation of China (61070162, 71071028)