Research on Distributed Parallel Association Rule Mining Algorithms (cited by: 13)
Abstract: Although the FP-Growth association rule mining algorithm is about an order of magnitude faster than Apriori, it has two major drawbacks: the frequent pattern tree (FP-tree) may grow too large to fit in memory, and the mining process is serial. This paper proposes a distributed parallel association rule mining algorithm. Designed for a distributed application data architecture, the algorithm does not need to build a global FP-tree, which avoids the problem of a global FP-tree that is too large for memory, and it is parallelized in each of its main steps. Test results and analysis show that, compared with the conventional FP-Growth algorithm, the proposed algorithm significantly improves execution efficiency and processing capacity through multi-node distributed parallel processing.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal, 2013, No. 10, pp. 113-115, 119 (4 pages)
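Funding: Jiangsu Province Modern Educational Technology Research Project (2011-R-19470); Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (11KJD520006)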
Keywords: data mining; association rule; frequent pattern; parallel algorithm
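The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of one idea it mentions: doing work per node and never assembling a global structure. It shows the first pass of FP-Growth (item support counting) run independently on each data partition, with only the small per-node count tables merged. The function names (`local_counts`, `global_frequent_items`, `reorder`), the use of threads to stand in for nodes, and the absolute `min_support` threshold are all assumptions for illustration, not the paper's method.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_counts(partition):
    """Count, on one node, how many local transactions contain each item."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def global_frequent_items(partitions, min_support):
    """Merge per-node counts and keep items meeting the support threshold.

    Threads simulate nodes here (an assumption); only the count tables,
    not the transactions, are combined.
    """
    with ThreadPoolExecutor() as pool:
        total = sum(pool.map(local_counts, partitions), Counter())
    return {item: n for item, n in total.items() if n >= min_support}


def reorder(partition, frequent):
    """Drop infrequent items and sort the rest by descending global support.

    This is the ordering FP-Growth uses before inserting transactions into
    an FP-tree; each node could then build a tree from its own data only.
    """
    return [
        sorted((i for i in set(t) if i in frequent),
               key=lambda i: (-frequent[i], i))
        for t in partition
    ]


# Two hypothetical nodes, each holding its own transactions.
partitions = [
    [["a", "b"], ["a", "c"]],
    [["a", "b", "c"], ["b"]],
]
frequent = global_frequent_items(partitions, min_support=3)
prepared = [reorder(p, frequent) for p in partitions]
```

In this sketch each node's reordered transactions stay local, which mirrors the abstract's point that no global FP-tree has to be held in one machine's memory; how the paper combines the local results is not stated in the abstract and is left out here.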