An Allocation Strategy for MapReduce Task Based on Value-Mean (一种基于Value均值的MapReduce任务分配策略)

Cited by: 1
Abstract: In big data processing, the MapReduce programming model is a useful approach to handling massive data sets. Its computation is divided into Map tasks and Reduce tasks. Data of different types come from different sources and arrive in different formats, so the List<Value> collections associated with different Key values differ. As a result, Reduce task nodes carry different loads, which shows up on the cluster as an unbalanced task load across nodes. To address this load imbalance among Reduce task nodes, this paper proposes a MapReduce task allocation strategy based on the Value mean: in the Reduce task, the List<Value> collections of the same Key value are averaged and the data are repartitioned according to Key value. Experiments show that the strategy improves the efficiency of Reduce task processing and applies to a wide range of data.
Author: XUE Yujie (薛愈洁) (Taiyuan University, Taiyuan 030001, China)
Affiliation: Taiyuan University
Source: Journal of TaiYuan University: Natural Science Edition (《太原学院学报(自然科学版)》), 2019, No. 1, pp. 56-59 (4 pages)
Keywords: MapReduce; Value-Mean; task allocation
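
The abstract does not spell out the allocation procedure, so the following is only an illustrative sketch of one possible reading, not the paper's algorithm. It assumes that "Value mean" refers to the mean number of values per Reduce node, and that Keys are reassigned so that each node's load stays close to that mean. The function name `mean_based_partition`, its parameters, and the greedy largest-first heuristic are all assumptions introduced here for illustration.

```python
import heapq


def mean_based_partition(grouped, num_reducers):
    """Assign each key to a reducer so per-reducer value counts stay near the mean.

    grouped: dict mapping key -> list of values (the List<Value> of that key).
    Returns (assignment, loads, mean_load). Hypothetical helper, not from the paper.
    """
    if num_reducers <= 0:
        raise ValueError("num_reducers must be positive")

    total = sum(len(values) for values in grouped.values())
    mean_load = total / num_reducers  # target load per reducer

    # Min-heap of (current load, reducer index): least-loaded reducer pops first.
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)

    assignment = {}
    # Assumed heuristic: place the largest value lists first (greedy LPT).
    for key in sorted(grouped, key=lambda k: len(grouped[k]), reverse=True):
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + len(grouped[key]), reducer))

    # Recompute the final load of every reducer from the assignment.
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += len(grouped[key])
    return assignment, loads, mean_load


if __name__ == "__main__":
    demo = {"a": [1] * 6, "b": [1] * 3, "c": [1] * 2, "d": [1]}
    assignment, loads, mean_load = mean_based_partition(demo, 2)
    print(assignment, loads, mean_load)
```

By contrast, a plain hash partitioner assigns Keys without looking at the size of their value lists, so a single heavy Key can leave one Reduce node far above the mean.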
