
Speculative task scheduling algorithm based on locality of data in Hadoop (cited by: 9)
Abstract: To address the limited optimization of existing task scheduling algorithms on Hadoop, this paper proposes a speculative task scheduling algorithm based on data locality. The algorithm computes the ratio of Map to Reduce task durations on each node and combines it with the locality of data on different nodes, giving a more accurate task-progress detection mechanism than existing algorithms for identifying fast and slow nodes. It then launches, on fast nodes, backup copies of the straggling tasks with the longest remaining time, moving computation instead of moving data. Experiments on Hadoop show that, compared with the existing scheduling algorithm, the proposed algorithm shortens average task running time, reduces the network congestion caused by data exchange between cluster racks, and speeds up task execution.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2014, No. 1, pp. 182-187 (6 pages)
Funding: National Natural Science Foundation of China (61070162, 71071028)
Keywords: Hadoop; job scheduling; heterogeneous environments; locality of data
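
The selection step described in the abstract (find slow nodes, pick the straggler with the longest remaining time, start its backup on a fast node that already holds the data) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The abstract does not give the Map/Reduce duration-ratio formula, so the sketch substitutes the simple remaining-time estimate `(1 - progress) / rate` known from the LATE scheduler. The names `Task`, `node_rates`, `pick_backup`, and the `slow_fraction` threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; not a Hadoop API type."""
    task_id: str
    node: str                 # node currently running the task
    progress: float           # fraction complete, in [0, 1]
    elapsed: float            # seconds since the task started
    replicas: set = field(default_factory=set)  # nodes holding the input block

    @property
    def rate(self) -> float:
        # Observed progress per second (assumes elapsed > 0).
        return self.progress / max(self.elapsed, 1e-9)

    @property
    def remaining(self) -> float:
        # LATE-style estimate: work left divided by observed rate.
        # Stand-in for the paper's duration-ratio method (an assumption).
        if self.rate == 0:
            return float("inf")
        return (1.0 - self.progress) / self.rate


def node_rates(tasks):
    """Average progress rate of the tasks running on each node."""
    per_node = {}
    for t in tasks:
        per_node.setdefault(t.node, []).append(t.rate)
    return {n: sum(r) / len(r) for n, r in per_node.items()}


def pick_backup(tasks, slow_fraction=0.25):
    """Return (task_id, target_node) for one backup task, or None."""
    rates = node_rates(tasks)
    ranked = sorted(rates, key=rates.get)            # slowest node first
    n_slow = max(1, int(len(ranked) * slow_fraction))
    slow_nodes = set(ranked[:n_slow])
    # Remaining nodes, fastest first.
    fast_nodes = [n for n in reversed(ranked) if n not in slow_nodes]

    stragglers = [t for t in tasks
                  if t.node in slow_nodes and t.progress < 1.0]
    if not stragglers or not fast_nodes:
        return None

    # Straggler with the longest estimated remaining time.
    victim = max(stragglers, key=lambda t: t.remaining)
    # Prefer a fast node that already stores the input block, so the
    # computation moves to the data instead of the other way round.
    local = [n for n in fast_nodes if n in victim.replicas]
    target = local[0] if local else fast_nodes[0]
    return victim.task_id, target
```

With tasks spread over four nodes where one node is clearly slower, `pick_backup` returns the slow node's least-advanced task together with the fastest node that holds a replica of its input. If no fast node holds a replica, the sketch falls back to the fastest node overall, which does require moving data.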
