期刊文献+

基于单向频繁模式树的频繁项集挖掘算法 被引量:3

Frequent Itemset Mining Algorithm Based on UFP-tree
下载PDF
导出
摘要 频繁项集挖掘是关联规则挖掘的关键步骤。FP-Growth算法是一种有效的频繁项集挖掘算法,它以自底向上的方式探索频繁模式树FP-tree,由FP-tree产生频繁项集。但是由于需要递归生成大量的条件FP-tree,其时间复杂度和空间复杂度都较高。针对这一问题,设计了一种基于单向频繁模式树的频繁项集挖掘算法UFIM。此算法首先构造一种单向频繁模式树UFP-tree结构,然后在UFP-tree上引入被约束子树,并对指向不同端点和指向相同端点的被约束子树分别采用递归和非递归的方法来挖掘频繁项集。非递归的方法判断端点的支持度计数是否小于最小支持度计数,若小于最小支持度计数则该棵被约束子树无频繁项集,否则其频繁项集是除根节点外的节点的排列组合。在mushroom数据集上的实验结果表明,UFIM算法的运行速度高于同类算法。 Mining frequent itemset is a key step in mining association rules.The FP-Growth algorithm is an efficient frequent itemset mining algorithm which explores the frequent pattern tree(FP-tree)by a bottom-up way,and generates frequent items by mining the FP-tree.However,its time complexity and space complexity are high because of needing to recursively generate a large number of conditional FP-tree.Aiming at this problem,we design a frequent itemset mining algorithm named UFIM based on unidirectional frequent pattern tree.This algorithm first constructs a unidirectional frequent pattern tree(UFP-tree)structure,then introduces a constrained sub-tree on the constructed UFP-tree;divides the constrained sub-tree into two cases:pointing to different endpoints and pointing to the same endpoints,and respectively uses recursive method and non-recursive method to mine frequent itemset.The non-recursive method determines whether the endpoint’s support count is smaller than the minimum support count.If it is smaller,the restricted sub-tree does not have frequent itemset,otherwise the frequent itemset of the restricted subtree is a node arrangement combination of the nodes besides root node.The experiment of mining frequent item set on the mushroom dataset shows that the running speed of the UFIM algorithm is higher than similar algorithms.
作者 蒋东洁 李玲娟 JIANG Dong-jie;LI Ling-juan(School of Computer Science,Nanjing University of Posts and Telecommunications,Nanjing 210023,China)
出处 《计算机技术与发展》 2019年第10期175-180,共6页 Computer Technology and Development
基金 国家自然科学基金(61302158,61571238)
关键词 数据挖掘 频繁项集 单向频繁模式树 被约束子树 data mining frequent itemset UFP-tree constrained sub-tree
  • 相关文献

参考文献7

二级参考文献49

  • 1杨健兵.数据挖掘中关联规则的改进算法及其实现[J].微计算机信息,2006(07X):195-197. 被引量:26
  • 2[1]J Han,Micheline Kamber. Data Mining:Concepts and Techniques[M].Morgan Kaufmann Publishers,2001
  • 3[2]R Agrawal,R Srikant. Fast algorithms for mining association rules[C].In: VLDB ′94,1994: 487~499
  • 4[3]R Agrawal ,T Imielinski ,A Swami. Mining association rules between sets of items in large databases[C].In:Proc 1993 ACM-SIGMOD Int Conf Management of Data (SIGMOD′93), Washington, DC, 1993-05:207~216
  • 5[4]J S Park ,M S Chen,P S Yu. An effective hash-based algorithm for mining association rules[C].In:SIGMOD'95,1995:175~186
  • 6[5]J Han,J Pei,Y Yin. Mining frequent patterns without candidate generation[C].In: Proc ACM SIGMOD, 2000:1~12
  • 7[6]C A Shaffer. Data Structures and Algorithm Analysis[M].Prentice Hall,1997
  • 8[1]R Agrawal,R Srikant.Fast algorithms for mining association rules.In:J Bocca,M Jarke,C Zaniolo,eds.Proc of the 20th Int'l Conf on Very Large DataBases (VLDB'94).San Francisco:Morgan Kaufmann,1994.487-499
  • 9[2]M Zaki,S Parthasarathy,M Ogihara,et al.New algorithms for fast discovery of association rules.In:D Heckerman,et al,eds.Proc of the 3rd Int'l Conf on Knowledge Discovery and Data Mining (KDD'97).Menlo Park,CA:AAAI Press,1997
  • 10[3]J Han,J Pei,Y Yin.Mining frequent patterns without candidate generation.In:M Dunham,J Naughton,W Chen,eds.Proc of 2000 ACM-SIGMOD Int'l Conf on Management of Data (SIGMOD'00).New York:ACM Press,2000.1-12

共引文献113

同被引文献31

引证文献3

二级引证文献6

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部