Abstract
In heterogeneous environments, Hadoop schedules reduce tasks essentially at random: when assigning a task, it considers neither data locality nor the computing capability of the node for that task. To address these problems, this paper proposes a self-adaptive reduce task scheduling algorithm (SARS) for heterogeneous environments. Based on the distribution of a reduce task's input data, the algorithm first selects the rack that holds the largest amount of that data. It then chooses the execution node by combining three factors: the amount of the task's data stored on each node, the node's computing capability, and how busy the node currently is. Experimental results show that SARS reduces both the network overhead incurred while reduce tasks execute and their execution time.
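The two-stage selection described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the formula that combines the three factors, so the `Node` fields, the `score` function, and its multiplicative weighting are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str
    local_bytes: int      # reduce input for this task already stored on the node
    compute_power: float  # relative speed, larger is faster (hypothetical unit)
    busy_slots: int       # reduce slots currently in use
    total_slots: int      # reduce slots the node offers


def pick_rack(nodes):
    """Stage 1: return the rack holding the most input data for the task."""
    per_rack = {}
    for n in nodes:
        per_rack[n.rack] = per_rack.get(n.rack, 0) + n.local_bytes
    return max(per_rack, key=per_rack.get)


def score(node, rack_bytes):
    """Hypothetical score combining data share, compute power and idleness.

    The real SARS weighting is not stated in the abstract; this product is
    an assumption made only for illustration.
    """
    data_share = node.local_bytes / rack_bytes if rack_bytes else 0.0
    idle = 1.0 - node.busy_slots / node.total_slots
    return (1.0 + data_share) * node.compute_power * idle


def pick_node(nodes):
    """Stage 2: pick the best-scoring node with a free slot in the chosen rack."""
    rack = pick_rack(nodes)
    in_rack = [n for n in nodes if n.rack == rack]
    rack_bytes = sum(n.local_bytes for n in in_rack)
    candidates = [n for n in in_rack if n.busy_slots < n.total_slots]
    if not candidates:
        return None  # every node in the rack is fully busy
    return max(candidates, key=lambda n: score(n, rack_bytes))
```

In this sketch a node with no free reduce slot is never chosen, and a faster but partly busy node can still win over an idle, slower one; how the actual algorithm trades these factors off is defined in the paper itself.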
Authors
付彦卓
张树东
李辉
Fu Yanzhuo; Zhang Shudong; Li Hui (College of Information Engineering, Capital Normal University, Beijing 100048, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journals list (北大核心)
2018, Issue 7, pp. 1989-1991, 2000 (4 pages)
Application Research of Computers
Funding
National Natural Science Foundation of China (31571563)
Beijing Engineering Research Center of Highly Reliable Embedded System Technology (2013BAH19F01)