期刊文献+

一种基于后缀项表的并行闭频繁项集挖掘算法 被引量:11

Parallel closed frequent itemset mining algorithm with postfix-table
下载PDF
导出
摘要 对现有的基于MapReduce的并行频繁项集挖掘算法进行了研究,提出一种基于后缀项表的并行闭频繁项集挖掘算法,通过后缀项表的引入及以闭频繁项集挖掘的形式,减少组分间的数据传送量,提高挖掘效率。实验表明,该算法可以有效缩短平均挖掘时间,对于高维大数据具有较好的性能。 Abstract: Based on current frequent itemsets mining using parallel FP-Growth algorithm with MapReduce framework, this pa- per proposed a parallel closed frequent itemsets mining algorithm with a postfix-table based on MapReduce framework. The algorithm generated closed frequent itemsets instead of a11 frequent itemsets. With a postfix-table structure, the algorithm could reduce the amount of data transfer between mappers and reducers efficiently. The experimental results show that the algorithm can shorten mining time efficiently. The algorithm has ~ood performance especially in long- transction mode.
出处 《计算机应用研究》 CSCD 北大核心 2014年第2期373-377,共5页 Application Research of Computers
基金 国家自然科学基金资助项目(61170277) 上海市教委科研创新重点项目(12zz137) 上海市一流学科建设项目(S1205YLXK)
关键词 频繁项集挖掘 并行挖掘算法 MAPREDUCE 闭频繁项集 后缀项表 frequent itemsets mining parallel mining algorithm MapReduce closed frequent itemsets postfix-table
  • 相关文献

参考文献14

  • 1HAN Jia-wei, CHENG Hong, XIN Dong, et al. Frequent pattern mi- ning: current status and future directions [J]. Data Mining and Knowledge Discovery,2007,15( 1 ) :55-86.
  • 2AGRAWALR,IMIELISKIT,SWAMIA.Miningassociationrulesbetweensetsofitemsinlargedatabases[J].ACM SIGMOD Record,1993,22(2):207-216.
  • 3HANJiawei,PEIJian,YINYiwen.Miningfrequentpatternswithoutcandidategeneration[J].ACMSIGMODRecord,2000,29(2):1-12.
  • 4ZA?ANEOR,ELHAJJM,LUP.Fastparallelassociationruleminingwithoutcandidacygeneration[C]//ProcofIEEE International ConferenceonDataMining.2001:665-668.
  • 5PRAMUDIONOI,KITSUREGAWA M.ParallelFPgrowthonPCcluster[C]//Procofthe7thPacificAsiaConferenceonAdvancesinKnowledgeDiscoveryandDataMining.Berlin: SpringerVerlag,2003:467-473.
  • 6LILi,ZHAIDong,JINFan.Aparallelalgorithmforfrequentitemsetmining[C]//Procofthe4thInternationalConferenceonParallelandDistributedComputing,ApplicationsandTechnologies.2003:868-871.
  • 7DEANJ,GHEMAWATS.MapReduce:simplifieddataprocessingonlargeclusters[J].CommunicationsoftheACM,2008,51(1):107-113.
  • 8LIHaoyuan,WANGYi,ZHANGDong,etal.PFP:parallelFPGrowthforqueryrecommendation[C]//ProcofACM ConferenceonRecommenderSystems.2008:107-114.
  • 9OWENS,ANILR,DUNNINGT,etal.Mahoutinaction[M].[S.l.]:ManningPublications,2011.
  • 10WANGSuqi,YANGYubin,CHENGuangpeng,etal.MapReducebasedclosedfrequentitemsetminingwithefficientredundancyfiltering[C]//Procofthe12thIEEEInternationalConferenceonDataMiningWorkshops.2012:449-453.

同被引文献47

  • 1黄敏,蔡志刚.缓存替换算法研究综述[J].计算机科学,2006,33(B12):191-193. 被引量:10
  • 2王黎明,张卓.基于iceberg概念格并置集成的闭频繁项集挖掘算法[J].计算机研究与发展,2007,44(7):1184-1190. 被引量:25
  • 3HanJiawei MichelineKamber.数据挖掘概念与技术[M].北京:机械工业出版社,2004..
  • 4Calders T,Rigotti C,Boulicaut J F.A survey on condensed representations for frequent sets[C] //Constraint-Based Mining and Inductive Databases.Berlin:Springer,2005:64-80.
  • 5Lucchese C,Orlando S,Perego R.Fast and memory efficient mining of frequent closed itemsets[J].IEEE Trans on Knowledge and Data Engineering,2006,18(1):21-36.
  • 6Pasquier N,Bastide Y,Taouil R,et al.Discovering frequent closed itemsets for association rules[C] //Proc of the 7th International Conference on Database Theory.Berlin:Springer,1999:398-416.
  • 7Boulicaut J F,Bykowski A,Rigotti C.Free-sets:a condensed representation of boolean data for the approximation of frequency queries[J].Data Mining and Knowledge Discovery,2003,7(1):5-22.
  • 8Cheng J,Ke Yiping,Ng W.δ-tolerance closed frequent itemsets[C] //Proc of the 6th IEEE International Conference on Data Mining.2006:139-148.
  • 9Jr Bayardo R J.Efficiently mining long patterns from databases[J].ACM SIGMOD Record,1998,27(2):85-93.
  • 10Burdick D,Calimlim M,Flannick J,et al.MAFIA:a maximal frequent itemset algorithm for transactional databases[J].IEEE Trans on Knowledge and Data Engineering,2005,17(11):1490-1504.

引证文献11

二级引证文献44

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部