摘要
随着社会网络数据的增加,社团发现获得来自学术界和工业界的大量关注,是因为它在现实世界中有许多的实际应用。格文-纽曼(Girvan-Newman,GN)是现今最流行的算法之一,但在大型网络上由于需要计算网络中每对节点之间的最短路径而产生了相应的局限性。为此,利用Map Reduce模型,提出了一种并行版本的GN算法来支持大规模网络的新方法,称之为最短路径之间的Map Reduce算法(Shortest Path Betweenness Map Reduce Algorithm,SPB-MRA)。此外,还提出了一个近似技术,进一步加快社区检测过程。在Hadoop上利用开源平台Map Reduce框架实现了SPB-MRA算法。结果表明,随着reducer数量的增加时间呈线性减小,并且引入了一种近似技术可以忽略误差。
Community detection from social network data gains much attention from academia and industry since it has many real-world applications. The Girvan-Newman (GN)algorithm is one of the most popular algorithms. But it has limitations in supporting large-scale networks since it needs to calculate the shortest path between every pair of nodes in a network. This paper proposes a new algorithm called Shortest Path Betweenness MapReduce Algorithm (SPB-MRA), which utilizes the MapReduce model. It suggests an approximation technique to further speed up community detection processes. It implements SPB-MRA on Hadoop, which is opensource platform for MapReduce. The results show that elapsed time decreases almost linearly as the number of reducers increases and the approximation technique introduces negligible errors.
出处
《微型机与应用》
2014年第22期74-76,共3页
Microcomputer & Its Applications
基金
国家自然科学基金(61205200
61372024)
浙江省自然科学基金(LY12F01005
LY12F03003)