Abstract
The MapReduce programming model is the most commonly used programming model in distributed computing. Its main purpose is to split a single very large computing task into many small tasks, each of which is handed to a different computer for processing. MapReduce divides a job into a map phase and a reduce phase, and each phase takes key/value pairs as input and produces key/value pairs as output. For the case where the number of map tasks is small and the number of reduce tasks is large, this article subdivides the keys of the map-phase tasks a second time and proposes a method for secondary classification of keys in the MapReduce programming model. Experiments show that the method improves data-processing efficiency over the original approach.
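The abstract does not spell out how the secondary classification of keys is performed, so the following is only a minimal in-memory sketch of the general idea, not the article's method. It assumes a word-count job and a hypothetical two-level partitioner: level 1 assigns a key to a coarse group, and level 2 subdivides that group so that keys are spread across a larger number of reducers. The names `stable_hash`, `two_level_partition`, `run_job`, `num_groups`, and `subs_per_group` are all illustrative assumptions.

```python
from collections import defaultdict
import zlib


def stable_hash(text):
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return zlib.crc32(text.encode("utf-8"))


def map_phase(lines):
    """Map: emit one (word, 1) key/value pair per word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def two_level_partition(key, num_groups, subs_per_group):
    """Hypothetical two-level key classification (an assumption, not the paper's scheme).

    Level 1 picks a coarse group from the key's hash; level 2 subdivides
    that group. A given key always maps to the same reducer index.
    """
    h = stable_hash(key)
    group = h % num_groups
    sub = (h // num_groups) % subs_per_group
    return group * subs_per_group + sub


def run_job(lines, num_groups=2, subs_per_group=3):
    """Run map, shuffle, and reduce in a single process for illustration."""
    num_reducers = num_groups * subs_per_group
    # Shuffle: route each pair to the reducer chosen by the partitioner.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in map_phase(lines):
        index = two_level_partition(key, num_groups, subs_per_group)
        buckets[index][key].append(value)
    # Reduce: each reducer sums the values of the keys it owns.
    result = {}
    for bucket in buckets:
        for key, values in bucket.items():
            result[key] = sum(values)
    return result
```

Calling `run_job(["a b a", "b c"])` counts each word across both input lines. Because every occurrence of a key is routed to the same reducer, the totals do not depend on how many groups or subdivisions are configured; only the distribution of work across reducers changes.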
Author
Liu Shuai (刘帅), Department of Computer Application, Xinzhou Vocational and Technical College, Xinzhou, Shanxi 034000, China
Source
Computer Era (《计算机时代》), 2018, No. 3, pp. 58-59, 62 (3 pages in total)