期刊文献+

一种基于Sketch的Top-k紧密中心性快速搜索算法 被引量:2

A Fast Sketch-Based Approach of Top-k Closeness Centrality Search on Large Networks
下载PDF
导出
摘要 在大数据的时代背景下,由于网络数据(network data)能有效简洁地描述社交网络、电子商务、医疗记录、在线教育等多种应用中各类复杂关系,越来越受到工业界和学术界的关注.在社交网络分析任务中,一个基本操作是从网络中发现重要程度前k大的节点.紧密中心性(closeness centrality)是一种常见的节点重要性刻画指标,它用节点在网络中心的程度来反映节点的重要性.用紧密中心性衡量节点重要性进行节点搜索的问题称为top-k紧密中心性搜索问题.然而,传统的精确算法由于其多项式级别的复杂度无法高效地扩展到大规模的网络数据上.近来,研究人员提出了近似算法,通过牺牲结果精度来获得性能提升.通过分析发现,目前存在的近似算法虽然性能得到了有效提升,但是结果精度牺牲过大.为了解决这个问题,该文设计了一种新颖的近似算法,叫做基于Sketch的紧密中心性搜索算法.此近似算法应用了一个全新的计算方式,利用Sketch估计同一距离的邻居数目,然后得到近似的最短距离之和,最终得到各个节点的紧密中心性的估计值.此算法的时间复杂度为O(mt Dmax),其中t是常数,Dmax是网络直径,m是网络边数.根据实际社交网络的小世界现象的特性,此近似算法基本是个线性算法.最后,相比于目前存在的精确算法和近似算法,该文通过全面的实验验证了基于Sketch的紧密中心性搜索算法在时间性能和结果精度等两方面的优势. The advanced development of various technologies on social network, e-commerce and online education has contributed to an increasing amount of large-scale network data. Among all sorts of network analysis tasks, one basic task is to search important nodes in a network. Closeness centrality is one of the popular metrics which measure the importance of a node in a network. Based on the closeness centrality, the basic task is called top-k closeness centrality search. However, the existing exact approaches cannot process large-scale networks because of their polynomial time complexity. Recently, some approximation algorithms are proposed, which achieve high performance by sacrificing the precision of results. But according to our study, we find that the loss of the precision of results is too much. To improve the precision of results while maintaining the high performance, in this paper, we propose a Sketch-based approximation algorithm for fast searching top-k closeness centrality in a large-scale network. The new algorithm is developed based on a new computation method which calculates the centrality by estimating the number of nodes within a certain distance by a data structure called FM-Sketch. The new algorithm has time complexity O(mtDmax), where t is a constant, Dmax is the diameter of a network and m is the number of edges in a network. With the small-world phenomenon assumption, the Sketch-based algorithm is a linear algorithm. Finally, we compare our Sketch-based algorithm with the state-of-the-art exact and approximation algorithms through extensive experiments. The results demonstrate the advantages of the new solution.
出处 《计算机学报》 EI CSCD 北大核心 2016年第10期1965-1978,共14页 Chinese Journal of Computers
基金 国家自然科学基金(61272155 61572039) 国家"九七三"重点基础研究发展规划项目基金(2014CB340405) 深圳政府研究项目(JCYJ20151014093505032)资助~~
关键词 紧密中心性 图算法 近似算法 图分析 社交网络 closeness centrality graph algorithm approximation algorithm graph analysis social network
  • 相关文献

参考文献4

二级参考文献44

共引文献110

同被引文献10

引证文献2

二级引证文献6

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部