Accelerating Iterative Big Data Computing Through MPI (cited by: 5)
Authors: 梁帆, 鲁小亿. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, No. 2, pp. 283-294 (12 pages).
Current popular systems, Hadoop and Spark, cannot achieve satisfactory performance when running iterative big data applications because they overlap computation and communication inefficiently. The pipeline of computing, data movement, and data management plays a key role in current distributed data computing systems. In this paper, we first analyze the overhead of the shuffle operation in Hadoop and Spark when running the PageRank workload, and then propose an event-driven pipeline and in-memory shuffle design with better overlapping of computation and communication, implemented as DataMPI-Iteration, an MPI-based library for iterative big data computing. Our performance evaluation shows that DataMPI-Iteration can achieve 9X-21X speedup over Apache Hadoop and 2X-3X speedup over Apache Spark for PageRank and K-means.
Keywords: iterative computation, DataMPI, Spark, Hadoop, MapReduce