期刊文献+

集群系统中的FP-Growth并行算法 被引量:8

FP-Growth Parallel Algorithm in Cluster System
下载PDF
导出
摘要 针对FP-Growth算法面临大规模数据库时空效率不高的问题,提出一种面向计算机集群的并行算法。采用投影方法直接寻找频繁项的条件数据库,将挖掘条件数据库的工作分化成若干独立的子任务,分配到集群中的节点上并行实现,由中央节点汇总结果并输出。结果证明,该算法不仅能够提高计算速度,解决数据库规模过大时内存溢出的情况,且具有良好的延展性。 When the dataset size is huge, both the memory usage and computational cost of FP-Growth algorithm are expensive. This paper proposes a parallel algorithm, which is designed to run on the PC cluster. This algorithm finds all the conditional pattern bases of frequent items by the projection method. It splits the mining task into number of independent sub-tasks, executes these sub-tasks in parallel on nodes and aggregates the sub-results back for the final result. Experiments show that this parallel algorithm not only can accelerate the computational speed, avoids the memory overflow, but also achieves much better scalability than the FP-Growth algorithm.
作者 陈敏 李徽翡
出处 《计算机工程》 CAS CSCD 北大核心 2009年第20期71-72,75,共3页 Computer Engineering
基金 国家自然科学基金资助项目"高维稀疏数据聚类研究"(70771007)
关键词 FP-GROWTH算法 计算机集群 并行算法 FP-Growth algorithm PC cluster parallel algorithm
  • 相关文献

参考文献5

  • 1Agrawal R, Imielinski T, Swami A N. Mining Association Rules Between Sets of Items in Large Databases[C]//Proc. of the ACM SIGMOD International Conference on Management of Data. Washington D.C., USA: ACM Press, 1993: 207-216.
  • 2Han Jiawei, Pei Jian, Yin Yiwen. Mining Frequent Patterns Without Candidate Generation[C]//Proc. of ACM-SIGMOD International Conference on Management of Data. Dallas, USA: ACM Press, 2000: 1-12.
  • 3ZaIane O R, Mohammad E H, Lu P. Fast Parallel Association Rule Mining Without Candidacy Generation[C]//Proc. of the 1st 1EEE International Conference on Data Mining. San Jose, USA: IEEE Computer Society Press, 2001: 665-668.
  • 4Liu Li, Li E, Zhang Yimin, et al. Optimization of Frequent Itemset Mining on Multiple-core Processor[C]//Proc. of the 33rd International Conference on Very Large Data Bases. Vienna, Austria: VLDB Endowment, 2007:1275-1285.
  • 5郑斌,李涓子.一种基于FP-Growth的改进算法[J].平顶山工学院学报,2008,17(4):9-12. 被引量:2

二级参考文献6

  • 1阮幼林,李庆华,刘干.最大频繁模式的快速挖掘与更新算法[J].计算机工程与应用,2005,41(24):23-26. 被引量:3
  • 2胡宏银.1999年中国智能自动化学术会议论文集[C].北京:清华大学出版社,1999:812-817.
  • 3R Groth.侯迪等译.数据挖掘-构筑企业竞争优势[M].西安:西安交通大学出版社,2001.
  • 4J Han,J Pei, Y Yin. Mining Frequent Patterns Without Candidate Generation[C]. In: Proe ACM- SIGMOD,Dallas,TX,2000 - 05.
  • 5[加]Jiawei Han Micheline Kamber,范明,孟小峰等译.数据挖掘:概念与技术[M].北京:机械工业出版社,2001..
  • 6杨红菊,梁吉业.一种挖掘频繁项集和频繁闭包项集的算法[J].计算机工程与应用,2004,40(13):176-178. 被引量:5

共引文献1

同被引文献74

  • 1邱勇,兰永杰.高效FP-TREE创建算法[J].计算机科学,2004,31(10):98-100. 被引量:4
  • 2谈克林,孙志挥.一种FP树的并行挖掘算法[J].计算机工程与应用,2006,42(13):155-157. 被引量:10
  • 3Han Jiawei,Kamber Micheline,范明,孟小峰,等译.数据挖掘概念与技术[M].北京:机械工业出版社,2007:424-479.
  • 4Agrawal R,Imielinski T,Swami A,et al.Mining Association rule between sets of items in large database[J].SIGMOD93,1993:207-216.
  • 5Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases [ C ]//Proceedings of ACM SIGMOD In- ternational Conference on Management of Date, 1993:207 - 216.
  • 6Agrawal R, Srikant R. Fast algorithms for mining association rules [C]//Proceedings of the 1994 International Conference on Very Large Data Bases, 1994:487 - 499.
  • 7Han J, Pei J, Yin Y. Mining Frequent Patterns Without Candidate Gen- eration[ C]//Proceedings of ACM SIGMOD International Conference on Management of Data,2000 : 1 - 12.
  • 8Pramudiono I, Kitsuregawa M. Parallel FP-Growth on PC cluster[ C ]// Proceedings of International Conference on Internet Computing,2003 : 467 - 473.
  • 9Zaiane O R, Mohammad E H, Lu P. Fast parallel association rule mining without candidacy generation[ C]//Proceedings of 1st IEEE International Conference on Data Mining,2001 : 665 - 668.
  • 10Liu L, Li E, Zhang Y, et al. Optimization of frequent item-set mining on multiple-core processors [ C ]//Proceedings of 33 rd International Con- ference on Very Large Data Bases,2007:1275-1285.

引证文献8

二级引证文献41

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部