Research on a partition-based frequent subtree mining algorithm
Abstract: TreeGrowth (TG), an embedded subtree mining algorithm based on the pattern growth principle, mines overly large subtrees and consumes a great deal of memory. To address these drawbacks, a new algorithm, PTG (partition tree growth), is proposed on the basis of partition mining. PTG divides the database into several partitions and first applies TG to each one to obtain that partition's local frequent subtrees. The local results are then filtered by global support count to obtain the global frequent subtrees, which reduces the number of subtrees mined and lowers memory overhead. Simulation results show that PTG resolves the out-of-memory problem that arises when mining large datasets, confirming its effectiveness and robustness.
Authors: 李娟 (Li Juan), 杨珺 (Yang Jun)
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2011, No. 6, pp. 2054-2057 (4 pages)
Keywords: pattern mining; frequent subtree; pattern growth; projection; partition mining
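
The two-phase flow described in the abstract (mine each partition locally, then filter the merged candidates by global support) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the TG miner is not reproduced, and a toy stand-in that counts ancestor-descendant label pairs (two-node embedded subtrees) takes its place. The tree representation, the function names `embedded_pairs`, `mine_local`, and `ptg_sketch`, and the use of a relative support threshold `min_sup` are all hypothetical. The sketch relies on the standard partition-mining property that a pattern frequent in the whole database must be frequent in at least one partition.

```python
from math import ceil

# Hypothetical tree representation: (label, [children]).

def embedded_pairs(tree):
    """Return the set of (ancestor_label, descendant_label) pairs in one tree."""
    pairs = set()

    def walk(node, ancestors):
        label, children = node
        for a in ancestors:
            pairs.add((a, label))
        for child in children:
            walk(child, ancestors + [label])

    walk(tree, [])
    return pairs

def mine_local(partition, min_count):
    """Toy stand-in for TG: patterns found in at least min_count trees."""
    counts = {}
    for tree in partition:
        for p in embedded_pairs(tree):
            counts[p] = counts.get(p, 0) + 1
    return {p for p, c in counts.items() if c >= min_count}

def ptg_sketch(database, min_sup, n_partitions):
    """Partitioned mining: local candidates first, then a global support filter."""
    size = max(1, ceil(len(database) / n_partitions))
    partitions = [database[i:i + size] for i in range(0, len(database), size)]

    # Phase 1: only one partition needs to be processed at a time.
    candidates = set()
    for part in partitions:
        candidates |= mine_local(part, ceil(min_sup * len(part)))

    # Phase 2: count each candidate over the full database and filter.
    support = dict.fromkeys(candidates, 0)
    for tree in database:
        for p in embedded_pairs(tree) & candidates:
            support[p] += 1
    global_min = ceil(min_sup * len(database))
    return {p: c for p, c in support.items() if c >= global_min}

db = [
    ("a", [("b", [("c", [])])]),
    ("a", [("b", []), ("d", [])]),
    ("a", [("c", [])]),
    ("e", [("b", [("c", [])])]),
]
result = ptg_sketch(db, min_sup=0.5, n_partitions=2)
print(result)
```

With four trees, two partitions, and a 50% threshold, every pair is locally frequent in its partition, but only the pairs that reach a global count of 2 survive the second phase. In a real setting the memory benefit would come from the local miner holding a single partition at a time.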