Abstract: We have witnessed the fast-growing deployment of Hadoop, an open-source implementation of the MapReduce programming model, for data-intensive computing in the cloud. However, Hadoop was not originally designed to run transient jobs in which users need to move data back and forth between storage and computing facilities. As a result, Hadoop is inefficient and wastes resources when operating in the cloud. This paper examines the inefficiency of MapReduce in the cloud: we study its causes and propose a solution. The inefficiency arises mainly during data movement. Transferring large data sets to computing nodes is very time-consuming and also violates the rationale of Hadoop, which is to move computation to the data. To address this issue, we developed a distributed cache system and a virtual machine scheduler. We show that our prototype can improve performance significantly when running different applications.