Method for improving MapReduce performance by prefetching before scheduling

Abstract: In this paper, a prefetching technique is proposed to solve the performance problem caused by remote data access delay. The technique first predicts the map tasks that will cause the delay and then preloads the input data of these tasks before they are scheduled. During execution, the input data can therefore be read from the local node, which hides the delay. The technique has been implemented in Hadoop-0.20.1. Experimental results show that the technique reduces the number of map tasks causing delay and improves the performance of Hadoop MapReduce by 20%.
Source: High Technology Letters (indexed in EI, CAS), 2012, Issue 4, pp. 343-349 (7 pages). 高技术通讯(英文版) (High Technology Letters, English edition)
Keywords: cloud computing; distributed computing; prefetching; MapReduce; scheduling; prefetching technique; performance problem; remote data access; input data; execution process; delay; preloading
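
The abstract describes two steps: predict which map tasks would read their input remotely, then preload that input before the tasks are scheduled. The following is a minimal sketch of that idea, not the paper's implementation or Hadoop's API. It assumes a simplified locality-aware scheduler, and every name in it (`MapTask`, `pick_task`, `predict_remote_tasks`, `prefetch`, `schedule`) is hypothetical. The list of nodes expected to free a slot next is taken as given; how the paper actually predicts it is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class MapTask:
    task_id: str
    replicas: set  # nodes that hold a copy of this task's input block


def pick_task(pending, node):
    """Assumed scheduler policy: prefer a task whose input is on `node`,
    otherwise fall back to the first pending task (a remote read)."""
    for task in pending:
        if node in task.replicas:
            return task
    return pending[0] if pending else None


def predict_remote_tasks(pending, upcoming_nodes):
    """Dry-run the scheduler over the nodes expected to free a slot next
    and return the (task, node) pairs that would need a remote read."""
    remaining = list(pending)
    remote = []
    for node in upcoming_nodes:
        task = pick_task(remaining, node)
        if task is None:
            break
        remaining.remove(task)
        if node not in task.replicas:
            remote.append((task, node))
    return remote


def prefetch(remote_pairs):
    """Stand-in for copying the input block to the predicted node:
    here we only record that the node now holds a copy."""
    for task, node in remote_pairs:
        task.replicas.add(node)


def schedule(pending, upcoming_nodes):
    """Run the real scheduling pass; map task_id -> (node, read_is_local)."""
    remaining = list(pending)
    placement = {}
    for node in upcoming_nodes:
        task = pick_task(remaining, node)
        if task is None:
            break
        remaining.remove(task)
        placement[task.task_id] = (node, node in task.replicas)
    return placement


if __name__ == "__main__":
    tasks = [MapTask("t1", {"A"}), MapTask("t2", {"A"}), MapTask("t3", {"B"})]
    nodes = ["A", "B", "C"]  # assumed order in which slots become free

    print("without prefetching:", schedule(tasks, nodes))
    prefetch(predict_remote_tasks(tasks, nodes))
    print("with prefetching:   ", schedule(tasks, nodes))
```

In this toy setup, task `t2` would otherwise run on node `C` with a remote read. After the predicted block is preloaded, all three tasks read locally. The sketch ignores the cost and timing of the preload itself, which a real system has to overlap with the execution of earlier tasks.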
