Mining Frequent Patterns from a Limited Number of Conditional FP_trees

Published English title: Mining frequent item sets from several conditional FP_trees

Abstract: Discovering association rules is a fundamental problem in data mining, and finding frequent patterns is the most expensive step in association rule discovery. The FP_growth (frequent pattern growth) method mines both long and short frequent item sets without generating candidate item sets, which greatly improves mining efficiency. However, FP_growth builds a large number of conditional FP_trees while mining, and these consume a large amount of memory. This paper studies FP_growth and proposes an improved algorithm that inherits the advantages of FP_growth while avoiding this bottleneck. The algorithm mines long and short frequent patterns by building only a limited number of conditional FP_trees (equal to the number of attributes, i.e. items, in the transaction database), which greatly reduces the space that FP_growth requires. Experiments show that the proposed algorithm is effective.
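To make the memory concern in the abstract concrete, the following is a minimal Python sketch of the standard FP_growth procedure, not the improved algorithm proposed in the paper (the record does not describe that algorithm in enough detail to reproduce it). All names here (`Node`, `build_tree`, `fp_growth`, the `stats` counter and the toy transactions) are illustrative assumptions. The sketch counts how many conditional FP_trees the recursion builds, which is the quantity the paper aims to bound.

```python
from collections import defaultdict


class Node:
    """One FP_tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Build an FP_tree from (items, count) pairs.

    Returns (header, frequent): header maps each frequent item to the list of
    tree nodes holding it; frequent maps each frequent item to its support.
    """
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Keep frequent items only, in descending support order (ties by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(weighted_transactions, min_support, suffix=(), stats=None):
    """Return {frozenset(pattern): support} using standard FP_growth.

    stats["conditional_trees"] is incremented once per conditional FP_tree
    built from a non-empty conditional pattern base.
    """
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, nodes in header.items():
        pattern = (item,) + suffix
        patterns[frozenset(pattern)] = frequent[item]
        # Conditional pattern base: the prefix path above every node of `item`.
        base = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        if base:
            if stats is not None:
                stats["conditional_trees"] += 1
            patterns.update(fp_growth(base, min_support, pattern, stats))
    return patterns


# Toy transaction database (hypothetical data), each transaction has weight 1.
transactions = [(["a", "b"], 1), (["b", "c"], 1),
                (["a", "b", "c"], 1), (["a", "c"], 1)]
stats = {"conditional_trees": 0}
patterns = fp_growth(transactions, min_support=2, stats=stats)
print(len(patterns), "frequent patterns,",
      stats["conditional_trees"], "conditional FP_trees built")
```

In this sketch a new conditional FP_tree is built at every recursion step, so the count grows with the number of frequent patterns rather than with the number of items. The abstract's proposal, by contrast, limits the number of conditional FP_trees to the number of items in the database.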

Authors: Lin Li (林丽), Feng Shaorong (冯少荣), Xue Yongsheng (薛永生)

Affiliation: Department of Computer Science, Xiamen University

Source: Computer Engineering and Applications (《计算机工程与应用》), 2007, No. 5, pp. 175-177 (3 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: Natural Science Foundation of Fujian Province of China (Grant No. A0310008); Key Project of the Fujian Province High-Tech Research Open Program (2003H043)

Keywords: association rules; FP_growth; frequent patterns (frequent item sets); conditional FP_tree

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
