摘要
挖掘最大频繁项集的算法多基于局部数据库,为此提出了一种基于频繁模式树的快速挖掘全局最大频繁项集算法。该算法首先挖掘出所有全局频繁项目组成集合,然后各个节点根据该集合构建局部频繁模式树,最后将该集合作为全局最大频繁候选项集,采用自顶向下策略挖掘出所有的全局最大频繁项集。与类Apriori算法相比,该算法采用的频繁模式树结构能大幅度降低数据库扫描次数和运行时间;自顶向下的策略能大幅度减少候选项集数和通信量。实验结果表明,该算法是快速和高效的。
Most mining maximum frequent itemsets algorithm based on local data base,so a fast algorithm for Mining Global Maximum Frequent Itemsets based on Frequent pattern tree(MGMFIF) was proposed.MGMFIF mined all global frequent items and made itemset,then local Frequent-Pattern tree(FP-tree) of each node was constructed based on this itemset.Finally,this itemset was chose as global maximum frequent itemsets,and all the global maximum frequent itemsets were obtained by top-down strategy.By adopting FP-tree structure,MGMFIF greatly reduced database scanning times and runtime comparing to Apriori-like algorithms.MGMFIF remarkably lessened candidate itemsets and communication traffic by using top-down strategy.Experimental results suggested that MGMFIF was fast and effective.
出处
《计算机集成制造系统》
EI
CSCD
北大核心
2011年第7期1547-1552,共6页
Computer Integrated Manufacturing Systems
关键词
数据挖掘
频繁模式树
全局最大频繁项集
算法
data mining
frequent-pattern tree
global maximum frequent itemsets
algorithms