Incremental Updating Algorithm of Parallel Association Rule Based on MapReduce (cited by: 12)
Abstract: Traditional association rule mining algorithms run inefficiently in big data environments, where data volumes grow rapidly. To address this problem, the paper proposes a parallel incremental updating algorithm for association rules, oriented to big data and based on the frequent pattern growth (FP-growth) algorithm. Each step of FP-growth is parallelized using the MapReduce programming model on a cloud computing platform. During incremental updating, the existing frequent itemsets and 1-itemsets are used to build a frequent pattern tree over the newly added transactions, and the frequent itemsets are then updated by scanning the original transaction database once. Experimental results show that, compared with traditional association rule mining algorithms, the proposed algorithm achieves higher mining efficiency and better scalability, and is suitable for incremental association rule mining on massive data.
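The incremental-update idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it swaps the FP-tree for plain itemset counting, simulates MapReduce with ordinary Python functions, and uses hypothetical names throughout (`map_count`, `reduce_counts`, `incremental_update`, `max_size`). It only shows the general principle that previously frequent itemsets need just their counts in the new transactions, while itemsets that become frequent only in the increment require a scan of the original database.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_count(partition, max_size=2):
    """Map step (simulated): count itemsets of up to max_size items in one partition."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def reduce_counts(partials):
    """Reduce step (simulated): sum the per-partition counters."""
    return reduce(lambda a, b: a + b, partials, Counter())


def incremental_update(old_frequent, original_partitions, n_original,
                       new_partitions, n_new, min_support):
    """Update frequent itemsets after new transactions arrive.

    old_frequent maps each itemset that was frequent in the original
    database to its count there. min_support is a ratio in (0, 1].
    """
    threshold = min_support * (n_original + n_new)
    new_counts = reduce_counts(map_count(p) for p in new_partitions)

    updated = {}
    # Previously frequent itemsets: the old count plus the increment count suffices.
    for itemset, count in old_frequent.items():
        total = count + new_counts.get(itemset, 0)
        if total >= threshold:
            updated[itemset] = total

    # Itemsets not frequent before can only become frequent overall
    # if they are frequent within the increment itself.
    candidates = {s for s, c in new_counts.items()
                  if s not in old_frequent and c >= min_support * n_new}
    if candidates:
        # A single pass over the original database resolves all candidates.
        original_counts = reduce_counts(map_count(p) for p in original_partitions)
        for s in candidates:
            total = original_counts.get(s, 0) + new_counts[s]
            if total >= threshold:
                updated[s] = total
    return updated


# Toy data (made up for illustration): two partitions each.
original_partitions = [[["a", "b"], ["a", "c"]], [["a", "b"], ["b", "c"]]]
new_partitions = [[["b", "c"]], [["b", "c"]]]
old_frequent = {("a",): 3, ("b",): 3, ("c",): 2, ("a", "b"): 2}

result = incremental_update(old_frequent, original_partitions, 4,
                            new_partitions, 2, min_support=0.5)
print(result)
```

On this toy input, the itemset `("a", "b")` falls below the new threshold and is dropped, while `("b", "c")` becomes frequent only after its count in the original database is retrieved.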
Authors: Cheng Guang (程广), Wang Xiaofeng (王晓峰)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2016, No. 2, pp. 21-25, 32 (6 pages)
Keywords: big data; cloud computing; MapReduce programming model; frequent itemset; incremental updating; association rule