Abstract
Most frequent subgraph mining algorithms are designed for undirected graphs and do not apply to directed graphs, which are of greater practical significance. To address this, the encoding structure of the undirected graph mining algorithm gSpan is extended and an improved canonical form is adopted, so that the encoding applies to directed graphs. The graph set is stored in the DADI++ storage structure, which targets directed graphs and reduces the cost of data access operations. During mining, a Hash table stores the Hash addresses and supports of isomorphic graphs, which avoids repeated scans of the graph set and direct isomorphism tests. Experimental results on real datasets show that the proposed Dspan algorithm is correct and more efficient than the FFSM algorithm.
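The idea of keeping supports in a Hash table keyed by a canonical code, so that isomorphic directed patterns land on the same entry without a pairwise isomorphism test, can be illustrated with a minimal sketch. This is not the Dspan canonical form or the DADI++ structure, neither of which is specified in the abstract: the brute-force `canonical_key` and the helpers `record` and `frequent` are hypothetical names introduced only for illustration, and the brute-force key is exponential in the number of vertices.

```python
from itertools import permutations


def canonical_key(vertex_labels, edges):
    """Brute-force canonical key of a small directed labeled graph.

    Assumption: this stands in for a real canonical form (it is NOT the
    extended gSpan code from the paper). vertex_labels is a list of string
    labels indexed by vertex id; edges is a list of (src, dst, edge_label).
    Every vertex renumbering is tried and the smallest encoding is kept,
    so isomorphic graphs receive the same key.
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[v] is the new id assigned to original vertex v
        labels = tuple(
            lab for _, lab in sorted((perm[v], vertex_labels[v]) for v in range(n))
        )
        # Edge direction is preserved: (src, dst) is never swapped
        arcs = tuple(sorted((perm[s], perm[d], lab) for s, d, lab in edges))
        code = (labels, arcs)
        if best is None or code < best:
            best = code
    return best


def record(table, graph_id, vertex_labels, edges):
    """Add graph_id to the supporting set of the pattern's key."""
    table.setdefault(canonical_key(vertex_labels, edges), set()).add(graph_id)


def frequent(table, min_support):
    """Return key -> support for patterns meeting the threshold."""
    return {k: len(ids) for k, ids in table.items() if len(ids) >= min_support}


# The pattern A -x-> B found in two graphs under different vertex numberings
table = {}
record(table, 0, ["A", "B"], [(0, 1, "x")])
record(table, 1, ["B", "A"], [(1, 0, "x")])
# The reversed pattern B -x-> A is a different directed graph
record(table, 1, ["A", "B"], [(1, 0, "x")])
result = frequent(table, min_support=2)
```

In this sketch the two occurrences of A -x-> B share one table entry with support 2, while the reversed edge gets its own entry, which is the distinction an undirected encoding would lose.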
Source
Science Technology and Engineering (《科学技术与工程》)
Peking University Core Journal (北大核心)
2012, Issue 20, pp. 5060-5065 (6 pages)