
iGraph: an incremental data processing system for dynamic graph (Cited by: 7)

Abstract: With the popularity of social networks, the demand for real-time processing of graph data is increasing. However, most existing graph systems adopt a batch processing mode, so the overhead of maintaining and processing dynamic graphs is significantly high. In this paper, we design iGraph, an incremental graph processing system for dynamic graphs with continuous updates. The contributions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex-based graph computing model to support incremental data processing; 3) hotspot detection and rebalancing methods to address the workload imbalance problem during incremental processing. Through its general-purpose API, iGraph can be used to implement various graph processing algorithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that on real-life datasets, iGraph outperforms the original GraphX in both graph update and graph computation.
Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2016, Issue 3, pp. 462-476 (15 pages). Chinese journal title: 中国计算机科学前沿(英文版)
Keywords: big data, distributed system, in-memory computing, graph processing, hotspot detection
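
The abstract names hash-based partitioning and hotspot detection but gives no details, so the following is only a minimal sketch of the general idea, not iGraph's actual design. It assumes integer vertex ids, places each edge in the partition chosen by its source vertex, and flags a partition as a hotspot when it holds more than a chosen multiple of the mean edge count. The class `HashPartitionedGraph`, its methods, and the `factor` threshold are all hypothetical names introduced for illustration.

```python
from collections import defaultdict


class HashPartitionedGraph:
    """Toy edge store: each edge lives in the partition chosen by hashing its source vertex."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One dict per partition: source vertex -> set of destination vertices.
        self.partitions = [defaultdict(set) for _ in range(num_partitions)]

    def partition_of(self, vertex_id):
        # Assumption: integer vertex ids, so a plain modulo serves as the hash.
        return vertex_id % self.num_partitions

    def add_edge(self, src, dst):
        # Fine-grained update: only the partition owning `src` is modified.
        pid = self.partition_of(src)
        self.partitions[pid][src].add(dst)
        return pid

    def remove_edge(self, src, dst):
        pid = self.partition_of(src)
        self.partitions[pid][src].discard(dst)
        return pid

    def edge_counts(self):
        # Number of edges currently stored in each partition.
        return [sum(len(dsts) for dsts in part.values()) for part in self.partitions]

    def hotspots(self, factor=2.0):
        # Hypothetical rule: a partition is "hot" if it holds more than
        # `factor` times the mean number of edges per partition.
        counts = self.edge_counts()
        mean = sum(counts) / len(counts)
        return [pid for pid, c in enumerate(counts) if mean > 0 and c > factor * mean]


graph = HashPartitionedGraph(num_partitions=4)
for dst in range(1, 9):
    graph.add_edge(0, dst)       # vertex 0 is a high-degree vertex
graph.add_edge(5, 6)
graph.add_edge(6, 7)
touched = graph.add_edge(3, 0)   # a single update touches a single partition
```

In this toy run the high-degree vertex 0 concentrates edges in one partition, which is the kind of skew a rebalancing step would then have to correct; how iGraph actually detects and rebalances hotspots is not stated in the abstract.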

