期刊文献+

从图数据库中挖掘频繁跳跃模式 被引量:10

Mining Frequent Jump Patterns from Graph Databases
下载PDF
导出
摘要 很多频繁子图挖掘算法已被提出.然而,这些算法产生的频繁子图数量太多而不能被用户有效地利用.为此,提出了一个新的研究问题:挖掘图数据库中的频繁跳跃模式.挖掘频繁跳跃模式既可以大幅度地减少输出模式的数量,又能使有意义的图模式保留在挖掘结果中.此外,跳跃模式还具有抗噪声干扰能力强等优点.然而,由于跳跃模式不具有反单调性质,挖掘它们非常具有挑战性.通过研究跳跃模式自身的特性,提出了两种新的裁剪技术:基于内扩展的裁剪和基于外扩展的裁剪.在此基础上又给出了一种高效的挖掘算法GraphJP(an algorithm for mining jump patterns from graph databases).另外,还严格证明了裁剪技术和算法GraphJP的正确性.实验结果表明,所提出的裁剪技术能够有效地裁剪图模式搜索空间,算法GraphJP是高效、可扩展的. Many algorithms on subgraph mining have been proposed. However, the number of frequent subgraphs generated by these algorithms may be too large to be effectively explored by users, especially when the support threshold is low. In this paper, a new problem of mining frequent jump patterns from graph databases is proposed. Mining frequent jump patterns can dramatically reduce the number of output graph patterns and still capture interesting graph patterns. Futhermore, jump patterns are robust against noise and dynamic changes in data. However, this problem is challenging due to the underlying complexity associated with frequent subgraph mining as well as the absence of Apriori property for jump patterns. By exploring the properties of jump patterns, two novel effective pruning techniques are proposed: Internal-Extension-Based pruning and external-extension-based pruning. Based on the proposed pruning techniques, an efficient algorithm GraphJP is presented for this new problem. It has been theoretically proven that the novel pruning techniques and the proposed algorithm are correct. Extensive experimental results demonstrate that the novel pruning techniques are effective in pruning the unpromising parts of search space, and GraphJP is efficient and scalable in mining frequent jump patterns.
出处 《软件学报》 EI CSCD 北大核心 2010年第10期2477-2493,共17页 Journal of Software
基金 国家自然科学基金Nos.60773063 60903017 国家重点基础研究发展计划(973)No.2006CB303000 NSFC/RGC联合资助项目No.60831160525~~
关键词 数据挖掘 图挖掘 图数据库 频繁子图 跳跃模式 data mining graph mining graph database frequent subgraph jump pattern
  • 相关文献

参考文献14

  • 1Inokuchi A,Washio T,Motoda H.An apriori-based algorithm for mining frequent substructures from graph data.In:Cheng M,Yu PS,Liu B,eds.Proc.of the 4th European Conf.on Principles of Data Mining and Knowledge Discovery.Lyon:Springer-Verlag,2000.13-23.
  • 2Kuramochi M,Karypis G.Frequent subgraph discovery.In:Cercone N,Lin TY,Wu X,eds.Proc.of the 1st IEEE Int'l Conf.on Data Mining.San Jose:IEEE Computer Society,2001.313-320.
  • 3Yan X,Han J.gSpan:Graph-Based substructure pattern mining.In:Aggrawal R,Dittrich K,Ngu AH,eds.Proc.of the 2nd IEEE Int'l Conf.on Data Mining.Maebashi:IEEE Computer Society,2002.721-724.
  • 4Borgelt C,Berhold MR.Mining molecular fragments:Finding relevant substructures of molecules.In:Aggrawal R,Dittrich K,Ngu AH,eds.Proc.of the 2nd IEEE Int'l Conf.on Data Mining.Maebashi:IEEE Computer Society,2002.51-58.
  • 5Huan J,Wang W,Prins J.Efficient mining of frequent subgraphs in the presence of isomorphism.In:Wu X,Tuzhilin A,eds.Proc.of the 3rd IEEE Int'l Conf.Data Mining.Melbourne:IEEE Computer Society,2003.549-552.
  • 6Nijssen S,Kok JN.A quickstart in frequent structure mining can make a difference.In:Kim W,Kohavi R,Gehrke J,DuMouchel W,eds.Proc.of the 10th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Seattle:ACM,2004.647-652.
  • 7Yan X,Han J.Closegraph:Mining closed frequent graph patterns.In:Getoor L,Senator TE,Domingos P,Faloutsos C,eds.Proc.of the 9th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Washington:ACM,2003.286-295.
  • 8Huan J,Wang W,Prins J,Yang J.Spin:Mining maximal frequent subgraphs from graph databases.In:Kim W,Kohavi R,Gehrke J,DuMouchel W,eds.Proc.of the 10th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Seattle:ACM,2004.581-586.
  • 9Thomas LT,Valluri SR,Karlapalem K.Margin:Maximal frequent subgraph mining.In:Proc.of the 6th IEEE Int'l Conf.Data Mining.Hong Kong:IEEE Computer Society,2006.1097-1101.http://www.computer.org/portal/web/csdl/doi/10.1109/ ICDM.2006.102.
  • 10http://dtp.nci.nih.gov/docs/aids/searches/list.html.

同被引文献193

引证文献10

二级引证文献206

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部