期刊文献+

大规模网络图中4节点子图数量快速估计算法

A Fast Mining Algorithm for 4-Node Graphlets in Large Graphs
下载PDF
导出
摘要 针对大规模网络图中存在的大量4节点子图数量难以精确统计分析的问题,提出了一种大规模网络图中4节点子图数量快速估计算法(SmartMoss)。该算法通过随机变量方差分析技术对比3路径采样算法(3PS)和中心3路径采样算法(C3PS)两种前沿算法的估计误差得出其各自不同的适用范围,进而通过计算被测网络图权重密度分布与误差实时选择使用3PS算法或C3PS算法对网络图中4节点子图进行快速采样,通过采样比例混合3PS算法与C3PS算法的估计结果实现对网络图中各4节点子图出现数量的快速估计。实验结果表明,在同等估计误差下提出的SmartMoss算法比已有3PS算法和C3PS算法快10倍以上。SmartMoss算法可以实现对大规模网络图中4节点子图数量进行快速准确的估计,同时为网络社团演化和恶意代码检测等实际应用提供一定的理论参考。 A fast method(SmartMoss)for efficiently estimating the frequencies of 4-node graph-lets in large-scale networks is proposed to address the challenge of exactly counting the frequencies of 4-node graphlets occurring in large-scale networks.SmartMoss compares the estimation errors of the two state-of-the-art algorithms,3PS(3-Path Sampling)and C3PS(Center 3-Path Sampling),through random variables variance analysis to obtain their different application ranges,and then selects either 3PS algorithm or C3PS algorithm for 4-node graphlets sampling by calculating the distribution of network weight densities and real-time errors,to achieve quick estimation of the frequencies of 4-node graphlets by combing the estimation results of 3PS and C3PS based on the sampling proportions.Experimental results show that SmartMoss is more than 10 times faster than the state-of-the-art algorithms 3PS and C3PS under the same estimation error.SmartMoss efficiently and accurately estimates the frequencies of 4-node graph-lets for large network graphs,and provides theoretical reference for practical applications such as network community evolution and malicious code detection.
作者 覃遵颖 孙雨 李国栋 齐怀睿 陶敬 QIN Zunying;SUN Yu;LI Guodong;QI Huairui;TAO Jing(Ministry of Education Key Lab for Intelligent Networks and Network Security,Xi’an Jiaotong University, Xi’an 710049,China;Center of Network,Xi’an Jiaotong University,Xi’an 710049,China;School of Electronic and Information Engineering,Xi’an Jiaotong University,Xi’an 710049,China)
出处 《西安交通大学学报》 EI CAS CSCD 北大核心 2018年第12期57-62,共6页 Journal of Xi'an Jiaotong University
基金 深圳市基础研究计划资助项目(JCYJ20160229195940462) 陕西省自然科学基础研究计划资助项目(2016JQ6034 2017JM6095) 浙江省自然科学基金资助项目(LGG18F020016)
关键词 网络图 采样算法 子图数量估计 network graph sampling algorithm graphlet counting
  • 相关文献

参考文献3

二级参考文献62

  • 1汪卫,周皓峰,袁晴晴,楼宇波,施伯乐.基于图论的频繁模式挖掘[J].计算机研究与发展,2005,42(2):230-235. 被引量:17
  • 2Zaki M J. Efficiently mining frequent trees in a forest[C]// Proc 2002 Int'l Conf Knowledge Discovery and Data Mining Edmonton. Alberta, Canada, 2002 : 135- 139.
  • 3Kuramochi M, Karypis G. Frequent subgraph discovery [C]//. Proc 2001 Int'l Conf IEEE International Conference on Data Mining. San Jose, USA, 2001:313-320.
  • 4Agrawal R, Srikant R. Fast algorithms for mining association rules[C]//. Proc 1994 Int'l Conf Very Large Data Bases. Santiago,1994:487 499.
  • 5Cormen T H, Leiserson C E. Introduction to Algorithms [M]. 2nd ed. Massachusettes: MIT Press, 2001 : 12- 14.
  • 6Yan Xifeng, Han Jiawei. gSpan:graph-based substructure pattern mining, Technical Report IUCDCS-R-2002-2296 [R]. Illinois, 2002 : 721 - 722.
  • 7Pei J, Han J. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth[C]//Proc 2001 Int'l Conf International Conference on Data Engineering. Heidelberg,2001 : 215-224.
  • 8Asai T, Abe K. Efficient substructure discovery from large semistructured data[C]//Proc 2002 Int'l Conf SIAM International Conference on Data Mining. Washington, 2002:225 -227.
  • 9Sun L, Zhang X. Efficient frequent pattern mining on Web logs[C]//Proc 2004 Int'l Conf Asia Pacific Web Conference. Hangzhou, 2004: 533- 542.
  • 10李先通,李建中,高宏.一种高效频繁子图挖掘算法[J].软件学报,2007,18(10):2469-2480. 被引量:35

共引文献26

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部