期刊文献+

BHP:面向BSP模型的负载均衡Hash图数据划分 被引量:5

BHP:BSP Model Oriented Hash Graph Data Partition with Load Balancing
下载PDF
导出
摘要 图数据划分是基于BSP(bulk synchronous parallel)编程模型的大规模图处理系统中一个关键技术问题。传统的图划分技术需要多次迭代,时间复杂度过高,且划分结果不具有图顶点到分区的映射信息,因此这些算法并不适用于BSP模型下的数据划分。提出了一种新的面向BSP模型的负载均衡Hash数据划分算法(balanced Hash partition,BHP)。为了实现各个分区的出边数尽可能均衡,该算法引入了虚拟桶的概念,通过贪婪算法将虚拟桶重组为实际分区,保证了每个实际分区负载均衡,同时数据本地化策略使本分片上的数据尽可能地保留在本节点上,从而减小在数据加载时的数据迁移开销。从三个方面对比了BHP算法和经典Hash算法的性能,结果表明BHP算法能够提高作业的执行效率,减少消息发送的数量,有效解决了经典Hash算法的负载不均衡和分区间交互边过多的问题,当数据量变大时,效果尤为明显。 Graph data partition is a critical technical problem in the large-scale graph processing system based on bulk synchronous parallel (BSP) model. Traditional graph partition technology requires many iterations, the time com- plexity is too high, and the partition results don' t have the mapping information from vertexes to partitions. So those algorithms are not suitable for partitioning graph data for the BSP model based systems. This paper presents a new Hash partition algorithm that implements load balance, called balanced Hash partition (BHP). In order to achieve the balance of the number of out degrees in each partition as much as possible, the concept of virtual bucket is introduced. Then these virtual buckets are reorganized into desired partitions by using a greedy algorithm. In this way, the algorithm can ensure the load balance of each partition. At the same time, the data localization strategy makes the data on the split locate on the corresponding node as much as possible. So, the overhead of data migration during the data loading can be reduced. This paper does some experiments to compare the performance between BHP and the classic Hash algorithm from multiple aspects. The results show that the BHP algorithm can improve the job executing efficiency,reduce the number of sending messages, and effectively solve the problems of load imbalance and too many interac- tive edges among partitions.
出处 《计算机科学与探索》 CSCD 2014年第1期40-50,共11页 Journal of Frontiers of Computer Science and Technology
基金 国家自然科学基金Nos.61033007 61173028 中央高校基本科研业务费专项资金No.N100704001 教育部-中国移动科研基金No.MCM20125021~~
关键词 BSP模型 图划分 分布式系统 负载均衡 虚拟桶 bulk synchronous parallel (BSP) graph partition distributed system load balance virtual bucket
  • 相关文献

同被引文献23

  • 1徐大川,韩继业,杜东雷.关于图划分问题的改进的近似算法[J].应用数学学报,2005,28(4):587-597. 被引量:4
  • 2林皎,陈文光,栗强,郑纬民,张益民.基于图划分的全基因组并行拼接算法[J].计算机研究与发展,2006,43(8):1323-1329. 被引量:5
  • 3CH1NG A. Giraph: Large-scale graph processing infrastructure on Hado.op [C] // Hadoop Summit 2011, Santa Clara, CA, USA. [S.l.:s.n:.], 2011.
  • 4MALEW1CZ G, AUSTEN M H, BIK A J C, et al. Pregel: a system for large-scale graph processing[C]//2010 ACM SIGMOD International Conference on Management of data, June 6-10, 2010, Indianapolis, Indiana, USA. New York: ACM Press, 2010: 135-146.
  • 5KHAYYAT Z, AWARA K, ALONAZI A, et al. Mizan: a system for dynamic load balancing in large-scale graph processing[C]// The 8th ACM European Conference on Computer Systems, April 15-17, 2013, Prague, Czech. New York : ACM Press, 2013: 169-182.
  • 6UGANDER J, BACKSTROM L. Balanced label propagation for partitioning massive graphs [C]// The 6th ACM International Conference on Web Search and Data Mining, February 6-8, 2013, Rome, Italy. New York: ACM Press, 2013: 507-516.
  • 7VAQUERO L, CUADRADO F, LOGOTHETIS D, et al. xDGP: a dynamic graph processing system with adaptive partitioning[J]. Eprint Arxiv, 2013(9).
  • 8BAO N T, SUZUMURA T. Towards highly scalable pregel-basedgraph processing platform with xl0 [C]//The 22nd International Conference on World Wide Web Companion, May 13-17, 2013, Rio de Janeiro, Brazil. Geneva: International World Wide Web Conferences Steering Committee, 2013: 501-508.
  • 9SALIHOGLU S, WIDOM J. GPS: A graph processing systemiC] //The 25th International Conference on Scientific and Statistical Database Management, July 29-31, 2013, Baltimore, Maryland, USA. New York: ACM Press, 2013.
  • 10VALIANT L G. A bridging model for parallel computation [J]. Communications of the ACM, 1990, 33(8): 103-111.

引证文献5

二级引证文献11

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部