摘要
密集子图体现了大图中的稠密部分,它是图中具有最高密度的子图,这使得它在事件检测,生物分析和社区发现等方面具有广泛应用和实用价值.现有的密集子图发现方法所使用的图模型描述不够详细,并且发现的密集子图缺乏统计显著性.为了解决以上问题,本文提出了异构属性网络这一新模型,然后在异构属性网络上通过非参数扫描统计和基于(k,Ψ)-核的方法发现高Steiner连通度的统计显著密集子图.首先构建异构属性网络,其包括类型、实体、关系和带有时序关系的属性信息;其次通过历史属性信息计算异构属性网络中每个实体的统计值,形成统计权重网络;然后利用非参数扫描统计方法测量统计权重网络中子图的统计显著性;最后由于此问题是NP-难的,于是提出了基于(k,Ψ)-核的局部扩展的近似统计显著密集子图发现算法.大量基于真实异构属性网络数据的实验结果证明了本文所提出算法的有效性和高效性.
Densest subgraphs(DS) with highest density reflect the densest part of the large graph,which makes it widely used and practical in event detection,biological analysis and community discovery,etc.The description of the graph model used by the existing DS discovery methods is not detailed enough,and the found DS lack statistical significance.To solve above limitations,this paper proposes a newmodel of heterogeneous attribute network(HAN),and then statistically significant densest subgraphs with high Steiner connectivity are found through non-parametric scan statistics and(k,Ψ)-core based methods.Firstly,construct a HAN,which includes types,entities,relations and attributes with temporal relationship;Secondly,calculate the statistical value of each entity in HAN by comparing data with the past and between peers,and form a statistical weighted network;Then,exploit non-parametric scan statistics to measure the statistical significance of subgraphs in statistical weighted network;In the end,since this problem is NP-hard,an approximate statistically significant densest subgraph algorithm based on(k,Ψ)-core and local expansion is proposed.Extensive experiments conducted on large real and HAN demonstrate the efficiency and effectiveness of our approaches.
作者
李源
范晓林
孙晶
赵宇海
LI Yuan;FAN Xiao-lin;SUN Jing;ZHAO Yu-hai(School of Information Science and Technology,North China University of Technology,Beijing 100144,China;School of Computer Science and Engineering,Northeastern University,Shenyang 110819,China)
出处
《小型微型计算机系统》
CSCD
北大核心
2021年第10期2203-2210,共8页
Journal of Chinese Computer Systems
基金
国家自然科学基金青年项目(61902004)资助
国家自然科学基金面上项目(61672041,61772124,61977001)资助
北京市教委科技项目(KM202010009009)资助
北方工业大学科研启动经费资助。
关键词
异构属性网络
密集子图
统计显著性
(k
Ψ)-核
heterogeneous attribute network
densest subgraph
statistically significance
(k
Ψ)-core