Graph Processing Framework Supporting Elastic Scalability in Distributed Shared Environment (分布共享环境下支持弹性伸缩的图处理框架)

Cited by: 1
Abstract: Graph processing is an important mode of big data processing and is widely used in machine learning, data statistics, and data mining. In enterprise deployments, several kinds of big data processing frameworks usually share one distributed cluster, so the runtime environment is open and shared, and graph processing must cope with computing resources that change dynamically. To adapt to these changes and make fuller use of the shared resources, a graph processing framework should support elastic scaling. A survey of existing graph processing frameworks found that they have not yet fully implemented elastic scaling. This paper therefore introduces SParTaG, a distributed parallel graph processing framework that supports elastic scaling. It first defines the graph processing task set and task model on top of a task-parallel model; it then designs and implements a dynamically scalable framework based on a task migration mechanism; finally, it designs a load-balancing scheduling algorithm that lets a graph job scale dynamically while it runs. Experiments show that SParTaG performs comparably to Giraph, a popular open-source graph processing framework, and that it scales elastically well.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, PKU Core, 2016, No. 7, pp. 901-914 (14 pages)
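Funding: National Natural Science Foundation of China, Nos. 61272154 and 61121063; National Basic Research Program of China (973 Program), No. 2011CB302604; Baidu Cloud Service Open Platform Demonstration Project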
Keywords: graph processing; distributed parallel computing; elastic scaling; task migration
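
The abstract names task migration and a load-balancing scheduler but does not describe either algorithm. The sketch below is not SParTaG's scheduler; it is only a minimal illustration, under assumed conditions, of how migrating tasks can rebalance load when the worker set grows or shrinks. Every name in it (`rebalance`, `total`, the task and worker identifiers) and the integer per-task load are hypothetical.

```python
from typing import Dict, List, Tuple

Assignment = Dict[str, Dict[str, int]]  # worker -> {task_id: load}; an assumed model
Migration = Tuple[str, str, str]        # (task_id, source_worker, target_worker)


def total(tasks: Dict[str, int]) -> int:
    """Sum of the loads of the tasks held by one worker."""
    return sum(tasks.values())


def rebalance(current: Assignment, workers: List[str]) -> Tuple[Assignment, List[Migration]]:
    """Greedy rebalancing after the worker set changes (illustrative only).

    `workers` is the worker set after scaling. Tasks on workers that left
    are reassigned first; then tasks move from the most-loaded worker to
    the least-loaded one while a move still narrows the gap between them.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    new: Assignment = {w: dict(current.get(w, {})) for w in workers}
    migrations: List[Migration] = []

    # Scale-in: place tasks of departed workers, largest first,
    # on whichever remaining worker is currently least loaded.
    orphans = [(load, task, src)
               for src, tasks in current.items() if src not in new
               for task, load in tasks.items()]
    for load, task, src in sorted(orphans, reverse=True):
        dst = min(workers, key=lambda w: total(new[w]))
        new[dst][task] = load
        migrations.append((task, src, dst))

    # Scale-out or skew: a move helps only if the task's load is smaller
    # than the gap, which guarantees progress and therefore termination.
    while True:
        heavy = max(workers, key=lambda w: total(new[w]))
        light = min(workers, key=lambda w: total(new[w]))
        gap = total(new[heavy]) - total(new[light])
        movable = [(load, task) for task, load in new[heavy].items() if 0 < load < gap]
        if not movable:
            break
        load, task = max(movable)
        del new[heavy][task]
        new[light][task] = load
        migrations.append((task, heavy, light))
    return new, migrations


# Usage: one worker holds four tasks, then a second worker joins.
placement, moves = rebalance({"w1": {"t1": 4, "t2": 3, "t3": 2, "t4": 1}}, ["w1", "w2"])
print(placement, moves)
```

A real framework would also have to move vertex state and in-flight messages with each task and would likely migrate only at superstep boundaries; the sketch ignores both and records one migration per hop.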