Design and Implementation of the Frequent Subgraph Mining Algorithm gSpan
Abstract Because most graph mining algorithms rely on frequent subgraphs, frequent subgraph mining has become an active research topic in data mining. Many efficient frequent subgraph mining algorithms have been proposed, and gSpan is widely regarded as the best of them. On compound datasets, however, gSpan's performance can be improved further by exploiting the special structure of chemical compounds. Previous work exploited the symmetry of molecular structures and the uneven distribution of atom types to propose new optimization strategies that improve gSpan's performance. Given the importance of gSpan to graph mining and to data mining as a whole, this paper designs and implements the gSpan algorithm. It also adopts the optimization strategies of reference [4] to further improve gSpan's running efficiency on compound datasets.
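Authors GUO Yulin (郭玉林), LIU Yong (刘勇) (School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China)
Source Intelligent Computer and Applications (《智能计算机与应用》), 2011, No. 3, pp. 55-57 (3 pages)
Funding National Natural Science Foundation of China (60973081); Natural Science Foundation of Heilongjiang Province (F201011); General Science and Technology Research Project of the Heilongjiang Provincial Department of Education (11551352, 12511401)
Keywords gSpan; frequent subgraphs; DFS code; lexicographic (dictionary) order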
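
To make the notion of a "frequent subgraph" in the abstract concrete, the following is a minimal sketch of only the first step of gSpan-style mining: counting the support of one-edge patterns across a graph database and keeping those that meet a minimum support. It is not the paper's implementation. The graph representation, the function names `edge_patterns` and `frequent_edges`, and the toy molecule-like graphs are all assumptions made for illustration.

```python
from collections import Counter

# Assumed representation (hypothetical, not from the paper):
# a graph is (vertex_labels, edges), where vertex_labels maps a vertex id
# to its label (e.g. an atom type) and edges is a list of
# (u, v, edge_label) triples describing an undirected labeled graph.


def edge_patterns(graph):
    """Return the set of distinct labeled one-edge patterns in one graph."""
    vertex_labels, edges = graph
    patterns = set()
    for u, v, edge_label in edges:
        # Sort the endpoint labels so that C-O and O-C are the same pattern.
        a, b = sorted((vertex_labels[u], vertex_labels[v]))
        patterns.add((a, edge_label, b))
    return patterns


def frequent_edges(database, min_support):
    """Map each one-edge pattern to its support, keeping only frequent ones.

    Support is the number of graphs containing the pattern, so each graph
    contributes at most 1 per pattern regardless of how often it occurs.
    """
    support = Counter()
    for graph in database:
        support.update(edge_patterns(graph))
    return {p: s for p, s in support.items() if s >= min_support}


# Toy database of three invented molecule-like graphs.
DATABASE = [
    ({0: "C", 1: "C", 2: "O"}, [(0, 1, "single"), (1, 2, "single")]),
    ({0: "C", 1: "C", 2: "O"}, [(0, 1, "single"), (1, 2, "double")]),
    ({0: "C", 1: "N"}, [(0, 1, "single")]),
]

if __name__ == "__main__":
    print(frequent_edges(DATABASE, min_support=2))
```

In gSpan, frequent one-edge patterns like these seed the pattern-growth search; larger candidates are then identified by their DFS codes and compared in lexicographic order, which this sketch does not attempt.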

References (2)

  • 1. JAHN K, KRAMER S. Optimizing gSpan for molecular datasets.
  • 2. KURAMOCHI M, KARYPIS G. An efficient algorithm for discovering frequent subgraphs. 2002.
