期刊文献+

基于MapReduce框架下的复杂网络社团发现算法 被引量:5

The community detection from networks of algorithm in Map Reduce framework
下载PDF
导出
摘要 随着社会网络数据的增加,社团发现获得来自学术界和工业界的大量关注,是因为它在现实世界中有许多的实际应用。格文-纽曼(Girvan-Newman,GN)是现今最流行的算法之一,但在大型网络上由于需要计算网络中每对节点之间的最短路径而产生了相应的局限性。为此,利用Map Reduce模型,提出了一种并行版本的GN算法来支持大规模网络的新方法,称之为最短路径之间的Map Reduce算法(Shortest Path Betweenness Map Reduce Algorithm,SPB-MRA)。此外,还提出了一个近似技术,进一步加快社区检测过程。在Hadoop上利用开源平台Map Reduce框架实现了SPB-MRA算法。结果表明,随着reducer数量的增加时间呈线性减小,并且引入了一种近似技术可以忽略误差。 Community detection from social network data gains much attention from academia and industry since it has many real-world applications. The Girvan-Newman (GN)algorithm is one of the most popular algorithms. But it has limitations in supporting large-scale networks since it needs to calculate the shortest path between every pair of nodes in a network. This paper proposes a new algorithm called Shortest Path Betweenness MapReduce Algorithm (SPB-MRA), which utilizes the MapReduce model. It suggests an approximation technique to further speed up community detection processes. It implements SPB-MRA on Hadoop, which is opensource platform for MapReduce. The results show that elapsed time decreases almost linearly as the number of reducers increases and the approximation technique introduces negligible errors.
作者 于静雯 杨冰
出处 《微型机与应用》 2014年第22期74-76,共3页 Microcomputer & Its Applications
基金 国家自然科学基金(61205200 61372024) 浙江省自然科学基金(LY12F01005 LY12F03003)
关键词 HADOOP MAPREDUCE 社区检测 GN算法 SPB—MRA Hadoop MapReduce community detection GN algorithm SPB-MRA
  • 相关文献

参考文献8

  • 1NEWMAN M E,GIRVAN M.Finding and evaluating community structure in networks[J].Physical Review E,2004,69(2):026113-1-026113-15.
  • 2DEAN J,GHEMAWAT S.Map Reduce:simplified data processing on large clusters[C].Communications of the ACM,2008,51(1):107-113.
  • 3CHAIKEN R,JENKINS B,LARSON P.-A°,et al.Scope:easy and efficient parallel processing of massive data sets[C].Proceedings of the VLDB Endowment,2008,1(2):1265-1276.
  • 4COHEN J,DOLAN B,DUNLAP M,et al.Mad skills:new analysis practices for big data[C].Proceedings of the VLDB Endowment,2009,2(2):1481-1492.
  • 5Hadoop Apache.Software Foundation.(2013-08-01)[2014-07-01].http://hadoop.apache.org.
  • 6Zeng Zengfeng,Wu Bin,Zhang Tiantian.A multi-source message passing model to improve the parallelism efficiency of graph mining on Map Reduce[C].Proceedings of 2012 IEEE International Parallel and Distributed Processing Symposium Workshops&Ph D Forum(IPDPSW),2012:2019-2025.
  • 7Stanford large network dataset collection[EB/OL].[2013-08-01].http://snap.stanford.edu/data.
  • 8Han Jiamei,KAMBER M,Pei Jian.Data mining:concepts and techniques(2nd edition).Morgan Kaufmann,2006.

同被引文献38

引证文献5

二级引证文献4

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部