期刊文献+

基于优化的FP-Tree的频繁闭合项集挖掘算法

Frequent Closed Itemsets Mining Algorithm Based on Improved FP-growth
下载PDF
导出
摘要 在经典的频繁闭合项集挖掘算法中,如Closet与Closet+,当条件模式数据库很庞大时,频繁项集的数目将会急剧增长,算法的效率会逐步恶化,并且算法挖掘结果的有效性也随着大量冗余模式的产生而下降.本文首先针对传统的FP-tree的算法,给出了一种改进的FP-tree算法,然后在新算法的基础上,提出新的频繁闭合项集挖掘算法,该算法只需把FP-Tree中所有由叶子结点到根结点的路径遍历一遍,就可以得到各项的所有子条件模式基,避免了传统FP-tree算法在同一条路径上向前回溯比较的繁琐.实验表明优化后的算法避免了资源的耗费,减少了频繁闭合项集挖掘的运算开销,大大提高了数据挖掘的效率. The classic mining algorithms for mining frequent itemsets, such as Closet and Closet +, are proved to be inefficient and produce many redundant patterns, when mining extremely large datasets. This paper gives a new method to improve the performance of FP-tree firstly. Then based on the improved FP-tree a frequent closed itemsets mining algorithm is provided to improve the effectiveness of mining frequent close itemsets. The new algorithm optimizes the process of mining frequent itemsets and does not need to build conditional FP-tree recursively. The experimental results show that the new approach can save execution time. The feasibility and effectiveness of this new algorithm are also proved by experiments.
出处 《曲阜师范大学学报(自然科学版)》 CAS 2009年第2期57-61,共5页 Journal of Qufu Normal University(Natural Science)
关键词 数据挖掘 闭合项集 频繁模式增长 data mining closed itemsets FP-growth
  • 相关文献

参考文献15

  • 1Agrawal R, Imielinski T, Swami A Mining association rules between sets of items in large databases [ J ]. SIGMOD93, May 1993.
  • 2Agrawal R, Srikant R. Fast algorithms for mining association rules [J]. VLDB94, Sept. 1994.
  • 3Bayardo R J. Effciently Mining long patterns from databases. SIGMOD'98, 1998.
  • 4Brin S, Motwani R, Ullman J D, et al Dynamic hemset Counting and Implication Rules for Market Basket Data [J]. SIGMOD'97, 1997.
  • 5Pei J, Han J, Mao R. CLOSET: An effcient algorithm for mining frequent closed itemsets [ J]. DMKD'00, 2000.
  • 6Gunopulos D, Mannila H, Saluja S. Discovering All Most Specific Sentences by Randomized Algorithms. ICDT'97, 1997.
  • 7Han E, Karypis G, Kumar V, Scalable Parallel Data Mining for Association Rules [J]. TKDE, 2000.12(2).
  • 8Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation [ J]. SIGMOD00, 2000.
  • 9Wang Jianyong , Han Jiawei, Pei Jian Closet + : Searching for the Best Strategies for mining frequent closed itemsets [ J ] SIGKDD'03.
  • 10] Liu J, Pan Y, Wang K, et al. Mining frequent item sets by opportunistic projection [J]. SIGKDD'02, 2002.

二级参考文献13

  • 1AGRAWAL R, IMIELINSKI T, SWAMI A. Mining association rules between sets of items in large databases[ A]. In Proc. 1993 ACM-SIGMOD Int. Conf. Management of Data ( SIGMOD' 93)[C]. Washington, DC, 1993.207-216.
  • 2MANNILA H, TOIVONEN H, VERKAMO I. Efficient Algorithms for Discovering Association Rules[ A]. Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining[ C], 1984. 181 -192.
  • 3AGRAWAL R, SRIKANT R.Fast algorithms for mining association roles[ A]. In Proc. 1994 Int. Conf. VeryLarge Data Bases ( VLDB'94)[C]. Santiago, Chile, 1994. 487 -499.
  • 4DONG G, LI J.Efficient mining of emerging patterns: Discovering trends and differences[A]. In Proc. 1999 Int. Conf. Knowledge Discovery and Data Mining (KDD' 99)[ C]. San Diego, CA, 1999.43 - 52.
  • 5AGRAWAL R, MANNILA H, SRIKANT R, et al. Fast discovery of association rules[ M]. In Advances in Knowledge Discovery and Data Mining, U.M. Fay'gad, G. Piatetsky-Shapiro, P, Smylh, and R,Uthurusamy ( Eds. ), AAAI/MIT Press, 1996, 307 -328.
  • 6AGRAWAL R, SRIKANT R, Mining sequential patterns[ A]. In Proc, 1995 Int. Conf. Data Engineering ( ICDE' 95) [ C]. Taipei,Taiwan, 1995. 3 - 14.
  • 7PARK JS, CHEN MS, YU PS. An effective hash-based algorithm for mining association rules[ A]. In Proc. 1995 ACM-SIGMOD Int,Conf. Management of Data ( SIGMOD' 95) [ C]. San Jose, CA,1995. 175 - 186.
  • 8PEI J, HAN J, MORTAZAVI-ASL B, et al. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth[ A].In Proc. 2001 Int. Conf. Data Engineering ( ICDE' 01) [ C]. Heidelberg, Germany, 2001. 215 - 224.
  • 9SRIKANT R, AGRAWAL R, Mining sequential patterns: Generalizations and performance improvements[ A]. In Prec. 5th Int. Conf.Extending Database Technology ( EDBT' 96) [ C]. Avignon, France,1996. 3 - 17.
  • 10SAVASERE A, OMIECINSKI E, NAVATHE S. An efficient algorithm for mining association roles in large databases[ A]. In Proc.1995 Int. Conf. Very Large Data Bases (VLDB' 95)[C]. Zurich,Switzerland, 1995. 432 - 443.

共引文献6

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部