Abstract
In real-world networks such as social networks and citation networks, nodes often belong to multiple overlapping communities. Overlapping community detection is therefore of great significance in complex network analysis, and identifying overlapping communities efficiently and accurately remains a difficult problem in community detection research. This paper proposes an overlapping community detection algorithm based on density peaks clustering (DPC) and community belongingness. First, a method is proposed for measuring the distance between nodes based on their direct and indirect neighbors. Second, a method is given for calculating the local density threshold and the following distance threshold of cluster centers in DPC, and cluster centers are selected automatically according to these two thresholds. Finally, DPC is applied to community detection, and a method for calculating community belongingness is given; community boundary nodes are assigned to communities according to their belongingness. Experiments on artificial and real-world datasets show that the proposed algorithm identifies overlapping community structure accurately and has near-linear time complexity, making it suitable for large-scale complex networks.
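The three steps named in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract does not give the paper's distance measure, threshold formulas, or belongingness definition, so the sketch substitutes assumptions. The distance is a Jaccard distance over closed neighborhoods, the thresholds `rho_min` and `delta_min` are passed in by hand rather than computed automatically, and belongingness is taken as the share of a node's neighbors in each community. All function and parameter names are hypothetical.

```python
def jaccard_distance(adj, u, v):
    # Assumed stand-in distance: 1 - Jaccard similarity of closed neighborhoods.
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return 1.0 - len(nu & nv) / len(nu | nv)


def density_peaks(adj, d_c):
    # Standard DPC quantities: local density rho and following distance delta.
    nodes = sorted(adj)
    dist = {(u, v): jaccard_distance(adj, u, v) for u in nodes for v in nodes}
    # rho[u]: number of other nodes closer to u than the cutoff d_c.
    rho = {u: sum(1 for v in nodes if v != u and dist[u, v] < d_c) for u in nodes}
    # Process nodes by decreasing density; ties are broken by node id.
    order = sorted(nodes, key=lambda u: (-rho[u], u))
    delta, parent = {}, {}
    for i, u in enumerate(order):
        if i == 0:
            # The densest node has no denser node to follow.
            delta[u], parent[u] = max(dist[u, v] for v in nodes), None
        else:
            # Nearest node that comes earlier in the density order.
            p = min(order[:i], key=lambda v: dist[u, v])
            delta[u], parent[u] = dist[u, p], p
    return rho, delta, parent, order


def assign_communities(adj, d_c, rho_min, delta_min):
    # A node is a center if both values reach the (hand-picked) thresholds;
    # every other node inherits the label of the node it follows.
    rho, delta, parent, order = density_peaks(adj, d_c)
    label = {}
    for u in order:
        is_center = parent[u] is None or (rho[u] >= rho_min and delta[u] >= delta_min)
        label[u] = u if is_center else label[parent[u]]
    return label


def overlapping_memberships(adj, label, min_share):
    # Assumed belongingness: share of a node's neighbors in each community.
    memberships = {}
    for u, nbrs in adj.items():
        counts = {}
        for v in nbrs:
            counts[label[v]] = counts.get(label[v], 0) + 1
        extra = {c for c, k in counts.items() if k / len(nbrs) >= min_share}
        memberships[u] = {label[u]} | extra
    return memberships


# Toy graph: two 4-cliques {0,1,2,3} and {4,5,6,7} joined by the edge 3-4.
adj = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
    4: {3, 5, 6, 7}, 5: {4, 6, 7}, 6: {4, 5, 7}, 7: {4, 5, 6},
}
label = assign_communities(adj, d_c=0.5, rho_min=3, delta_min=0.5)
memberships = overlapping_memberships(adj, label, min_share=0.25)
```

On this toy graph the two cliques receive different labels, and the bridge nodes 3 and 4 are placed in both communities. The sketch computes all pairwise distances and is therefore quadratic in the number of nodes; it does not reproduce the near-linear complexity reported for the paper's algorithm.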
Authors
郭昆
彭胜波
张瑛瑛
陈羽中
GUO Kun; PENG Sheng-bo; ZHANG Ying-ying; CHEN Yu-zhong (College of Mathematics and Computer Sciences, Fuzhou University, Fuzhou 350116, China; Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou 350116, China; Key Laboratory of Spatial Data Mining & Information Sharing, Ministry of Education, Fuzhou 350116, China; Power Science and Technology Corporation State Grid Information & Telecommunication Group, Fuzhou 350003, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals
2019, No. 5, pp. 1127-1136 (10 pages)
Journal of Chinese Computer Systems
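Funding
National Natural Science Foundation of China (61300104, 61300103, 61672158)
Fujian Province Outstanding Youth Science Foundation for Universities (JA12016)
Program for New Century Excellent Talents in Fujian Province Universities (JA13021)
Fujian Province Outstanding Youth Science Foundation (2014J06017, 2015J06014)
Fujian Province Science and Technology Innovation Platform Program (2009J1007, 2014H2005)
Natural Science Foundation of Fujian Province (2013J01230, 2014J01232)
Fujian Province University-Industry Cooperation Project (2014H6014, 2017H6008)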
Keywords
overlapping community
density peaks
distance between nodes
automatic cluster center selection
community belongingness