
Parallel Computation of the FP-Growth Algorithm for Computer Cluster Systems (cited by: 1)

Parallel FP-Growth Algorithm on PC Cluster
Abstract: FP-Growth is a classic algorithm for frequent pattern mining. It produces all frequent patterns without generating candidate item sets and is far more efficient than earlier algorithms such as Apriori. During mining, however, FP-Growth must recursively build a large number of conditional pattern bases and conditional FP-trees and mine each of them separately. The resulting time and space costs limit its practical use on large-scale databases, and parallel execution offers a way to improve performance further. A PC cluster is a parallel processing system built by connecting multiple ordinary computers in a defined way. It is a distributed computing environment that offers high computing power, a good price-performance ratio, and flexibility. This paper proposes Gridify FP-Growth, a parallel mining algorithm for PC clusters. Building on FP-Growth, it divides the computation into tasks, assigns them to the compute nodes of the cluster to make full use of each node's resources, and finally combines the partial results from all nodes into the overall result. Experiments show that Gridify FP-Growth does not sacrifice accuracy, substantially reduces execution time, and effectively relieves memory pressure when processing large-scale databases.
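The abstract does not say how Gridify FP-Growth divides the work or merges the results, so the following is only a minimal sketch of one plausible task-division scheme, not the author's implementation. It assumes that each node receives a subset of the frequent items and mines every pattern whose least frequent item is in that subset. It also swaps in pattern growth over projected (conditional) pattern bases in place of an explicit FP-tree, and uses a thread pool as a stand-in for cluster nodes. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def mine_suffix(pattern_base, suffix, min_support):
    """Grow patterns that end in `suffix` from a conditional pattern base.

    `pattern_base` is a list of (prefix_items, count) pairs; every prefix
    is ordered from most to least frequent item.
    """
    results = {}
    counts = Counter()
    for items, count in pattern_base:
        for item in items:
            counts[item] += count
    for item, support in counts.items():
        if support < min_support:
            continue
        new_suffix = suffix | {item}
        results[new_suffix] = support
        # Conditional base for `item`: the part of each prefix before it.
        sub_base = [(items[:items.index(item)], count)
                    for items, count in pattern_base if item in items]
        results.update(mine_suffix(sub_base, new_suffix, min_support))
    return results


def node_task(ordered_transactions, assigned_items, min_support):
    """Work for one (simulated) node: mine all patterns whose least
    frequent item is one of `assigned_items`."""
    results = {}
    for item in assigned_items:
        base = [(t[:t.index(item)], 1)
                for t in ordered_transactions if item in t]
        suffix = frozenset([item])
        results[suffix] = len(base)
        results.update(mine_suffix(base, suffix, min_support))
    return results


def parallel_frequent_patterns(transactions, min_support, n_nodes=2):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = sorted((i for i, c in counts.items() if c >= min_support),
                      key=lambda i: (-counts[i], i))
    rank = {item: pos for pos, item in enumerate(frequent)}
    # Drop infrequent items and order each transaction by global frequency.
    ordered = [sorted((i for i in set(t) if i in rank), key=rank.get)
               for t in transactions]
    # Task division (assumed): deal the frequent items out round-robin.
    tasks = [frequent[k::n_nodes] for k in range(n_nodes)]
    merged = {}
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = pool.map(
            lambda items: node_task(ordered, items, min_support), tasks)
        # Merge step: tasks own disjoint patterns, so a plain union suffices.
        for partial in partials:
            merged.update(partial)
    return merged
```

Under this scheme every frequent pattern belongs to exactly one task (the one that owns its least frequent item), so the merged result does not depend on the number of nodes. For example, `parallel_frequent_patterns([["a","b","c"],["a","b"],["a","c"],["b","c"],["a","b","c"]], 3)` yields the three single items with support 4 and the three pairs with support 3.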
Author: Chen Min (陈敏)
Source: China Management Informationization (《中国管理信息化》), 2009, No. 15, pp. 36-38 (3 pages)
Keywords: Frequent Patterns; FP-Growth; Parallel Execution; PC Cluster

References (6)

  • 1 Iko Pramudiono, Masaru Kitsuregawa. Parallel FP-Growth on PC Cluster. Proc of the International Conference on Internet Computing. 2003
  • 2 Artur Bykowski, Christophe Rigotti. A Condensed Representation to Find Frequent Patterns. Proc of the ACM SIGACT-SIGMOD-SIGART Symp on Principles of Database Systems (PODS). 2001
  • 3 Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules. Proceedings of the International Conference on Very Large Databases (VLDB). 1994
  • 4 Brin S, Motwani R, Silverstein C. Beyond Market Baskets: Generalizing Association Rules to Correlations. Proceedings of the ACM-SIGMOD International Conference on Management of Data (SIGMOD). 1997
  • 5 Agrawal R, Srikant R. Mining Sequential Patterns. Proceedings of the International Conference on Data Engineering (ICDE). 1995
  • 6 Han J, Pei J, Yin Y. Mining Frequent Patterns Without Candidate Generation. Proc ACM-SIGMOD. 2000

