期刊文献+

基于邻域k-核的社区模型与查询算法

Community Model and Query Algorithm Based on Neighborhood k-core
下载PDF
导出
摘要 现实生活中的网络通常存在社区结构,社区查询是图数据挖掘的基本任务.现有研究工作提出了多种模型来识别网络中的社区,如基于k-核的模型和基于k-truss的模型.然而,这些模型通常只限制社区内节点或边的邻居数量,忽略了邻居之间的关系,即节点的邻域结构,从而导致社区内节点的局部稠密性较低.针对这一问题,将节点的邻域结构信息融入k-核稠密子图中,提出一种基于邻域连通k-核的社区模型,并定义了社区的稠密度.基于这一新模型,研究了最稠密单社区查询问题,即返回包含查询节点集且具有最高稠密度的社区.在现实生活图数据中,一组查询节点可能会分布在多个不相交的社区中.为此,进一步研究了基于稠密度阈值的多社区查询问题,即返回包含查询节点集的多个社区,且每个社区的稠密度不低于用户指定的阈值.针对最稠密单社区查询和基于稠密度阈值的多社区查询问题,首先定义了边稠密度的概念,并提出了基于边稠密度的基线算法.为了提高查询效率,设计了索引树和改进索引树结构,能够支持在多项式时间内输出结果.通过与基线算法在多组数据集上的对比,验证了基于邻域连通k-核的社区模型的有效性和所提出查询算法的效率. Real-world networks often exhibit community structures,and community query is a fundamental task in graph data mining.Existing studies introduced various models to identify communities within networks,such as k-core based models and k-truss based models.Nevertheless,these models typically confine themselves to constraining the number of neighbors of nodes or edges within a community,disregarding the relationships between these neighbors,namely,the neighborhood structure of the nodes.Consequently,the localized density of nodes within communities tends to be low.To address this limitation,this study integrates the information regarding the neighborhood structure of nodes into the k-core dense subgraph model,thereby introducing a community model based on neighborhood k-core and defining the density of a community.Based on the novel model,this study investigates the densest single community query problem which outputs the community containing the query node set with the highest community density.In real-life networks,the query nodes may be distributed across multiple disjoint communities.To this end,this study further works on the problem of multi-community query based on a density threshold.This entails returning multiple communities that encompass the query node set,with each community demonstrating a density no lower than the user-specified threshold.For the problem of the densest single community query and the multi-community query based on a density threshold,this study introduces the concept of edge density with which the basic algorithms are proposed.To improve the efficiency,the index tree and the enhanced index tree structures are devised to support outputting results in polynomial time.The effectiveness of the community model based on neighborhood k-core and the efficiency of query algorithms are demonstrated through comparative analyses against basic algorithms using several different datasets.
作者 张琦 程苗苗 李荣华 王国仁 ZHANG Qi;CHENG Miao-Miao;LI Rong-Hua;WANG Guo-Ren(School of Computer Science and Technology,Beijing Institute of Technology,Beijing 100081,China)
出处 《软件学报》 EI CSCD 北大核心 2024年第3期1051-1073,共23页 Journal of Software
基金 国家重点研发计划(2021VEB3301301) 国家自然科学基金(62072034,U2241211) 中国博士后科学基金(2023M730251,2023TQ0026)。
关键词 社区搜索 邻域结构 k-核子图 community search neighborhood structure k-core subgraph
  • 相关文献

参考文献2

二级参考文献46

  • 1Zhang Tian,Ramakrishnan R,Livny M.Birch:A New Data Clustering Algorithm and Its Applications.Data Mining and Knowledge Discovery,1997,1(2):141-182.
  • 2MacQueen J.Some Methods for Classification and Analysis of Multivariate Observations//Proc of the 5th Berkeley Symposium on Mathematical Statistics and Probability.Berkeley,USA,1967,Ⅰ:281-297.
  • 3Ester M,Riegel H P,Sander J,et al.A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise//Proc of the 2nd International Conference on Knowledge Discovery and Data Mining.Portland,USA,1996:291-316.
  • 4Ding C,He Xiaofei.k-Nearest-Neighbor Consistency in Data Clustering:Incorporating Local Information into Global Optimization//Proc of the ACM Symposium on Applied Computing.Nicosia,Cy-prus,2004:584-589.
  • 5Fr(a)nti P,Virmajoki O,Hautam(a)ki V.Fast Agglomerative Clustering Using a k-Nearest Neighbor Graph.IEEE Trans on Pattern Analysis and Machine Intelligence,2006,28(11):1875-1881.
  • 6Zhu Qiaoming,Li Jnnhui,Zhou Guedong,et al.A Novel Hierarchical Document Clustering Algorithm Based on a kNN Connection Graph//Proc of the 21st International Conference on the Computer Processing of Oriental Languages.Singapore,Singapore,2006:120-130.
  • 7Teng S H,Yao F F.k-Nearest-Neighbor Clustering and Percolation Theory.Algorithmica,2007,49(3):192-211.
  • 8刘大有,刘杰,金弟.基于k最近邻划分的聚类算法研究//中国人工智能进展2007.哈尔滨,2007:169-174.
  • 9Luxburg U V.A Tutorial on Spectral Clustering.Statistics and Computing,2007,17(4):395-416.
  • 10Newman M E,Girvan M.Finding and Evaluating Community Structure in Networks[EB/OL].[2004-02-26].http://www.neibi.org/gateway/paf/newman%20community%20struct%20networks %20phys%20rev.pdf.

共引文献202

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部