期刊文献+

Efficient Incremental Maintenance of Frequent Patterns with FP-Tree 被引量:9

Efficient incremental maintenance of frequent patterns with FP-tree
原文传递
导出
摘要 Mining frequent patterns has been studied popularly in data mining area. However, little work has been done on mining patterns when the database has an influx of fresh data constantly. In these dynamic scenarios, efficient maintenance of the discovered patterns is crucial. Most existing methods need to scan the entire database repeatedly, which is an obvious disadvantage. In this paper, an efficient incremental mining algorithm, Incremental-Mining (IM), is proposed for maintenance of the frequent patterns when incremental data come. Based on the frequent pattern tree (FP-tree) structure, IM gives a way to make the most of the things from the previous mining process, and requires scanning the original data once at most. Furthermore, IM can identify directly the differential set of frequent patterns, which may be more informative to users. Moreover, IM can deal with changing thresholds as well as changing data, thus provide a full maintenance scheme. IM has been implemented and the performance study shows it outperforms three other incremental algorithms: FUP, DB-tree and re-running frequent pattern growth (FP-growth). Keywords data mining - association rule mining - frequent pattern mining - incremental mining Supported by the National Basic Research 973 Program of China under Grant No.G1999032705.Xiu-Li Ma received the Ph.D. degree in computer science from Peking University in 2003. She is currently a postdoctoral researcher at National Lab on Machine Perception of Peking University. Her main research interests include data warehousing, data mining, intelligent online analysis, and sensor network.Yun-Hai Tong received the Ph.D. degree in computer software from Peking University in 2002. He is currently an assistant professor at School of Electronics Engineering and Computer Science of Peking University. His research interests include data warehousing, online analysis processing and data mining.Shi-Wei Tang received the B.S. degree in mathematics from Peking University in 1964. Now, he is a professor and Ph.D. supervisor at School of Electronics Engineering and Computer Science of Peking University. His research interests include DBMS, information integration, data warehousing. OLAP, and data mining, database technology in specific application fields. He is the vice chair of the Database Society of China Computer Federation.Dong-Qing Yang received the B.S. degree in mathematics from Peking University in 1969. Now, she is a professor and Ph.D supervisor at School of Electronics Engineering and Computer Science of Peking University. Her research interests include database design methodology, database system implementation techniques, data warehousing and data mining, information integration and sharing in Web environment. She is a member of academic committee of Database Society of China Computer Federation. Mining frequent patterns has been studied popularly in data mining area. However, little work has been done on mining patterns when the database has an influx of fresh data constantly. In these dynamic scenarios, efficient maintenance of the discovered patterns is crucial. Most existing methods need to scan the entire database repeatedly, which is an obvious disadvantage. In this paper, an efficient incremental mining algorithm, Incremental-Mining (IM), is proposed for maintenance of the frequent patterns when incremental data come. Based on the frequent pattern tree (FP-tree) structure, IM gives a way to make the most of the things from the previous mining process, and requires scanning the original data once at most. Furthermore, IM can identify directly the differential set of frequent patterns, which may be more informative to users. Moreover, IM can deal with changing thresholds as well as changing data, thus provide a full maintenance scheme. IM has been implemented and the performance study shows it outperforms three other incremental algorithms: FUP, DB-tree and re-running frequent pattern growth (FP-growth). Keywords data mining - association rule mining - frequent pattern mining - incremental mining Supported by the National Basic Research 973 Program of China under Grant No.G1999032705.Xiu-Li Ma received the Ph.D. degree in computer science from Peking University in 2003. She is currently a postdoctoral researcher at National Lab on Machine Perception of Peking University. Her main research interests include data warehousing, data mining, intelligent online analysis, and sensor network.Yun-Hai Tong received the Ph.D. degree in computer software from Peking University in 2002. He is currently an assistant professor at School of Electronics Engineering and Computer Science of Peking University. His research interests include data warehousing, online analysis processing and data mining.Shi-Wei Tang received the B.S. degree in mathematics from Peking University in 1964. Now, he is a professor and Ph.D. supervisor at School of Electronics Engineering and Computer Science of Peking University. His research interests include DBMS, information integration, data warehousing. OLAP, and data mining, database technology in specific application fields. He is the vice chair of the Database Society of China Computer Federation.Dong-Qing Yang received the B.S. degree in mathematics from Peking University in 1969. Now, she is a professor and Ph.D supervisor at School of Electronics Engineering and Computer Science of Peking University. Her research interests include database design methodology, database system implementation techniques, data warehousing and data mining, information integration and sharing in Web environment. She is a member of academic committee of Database Society of China Computer Federation.
出处 《Journal of Computer Science & Technology》 SCIE EI CSCD 2004年第6期876-884,共9页 计算机科学技术学报(英文版)
基金 国家重点基础研究发展计划(973计划)
关键词 data mining association rule mining frequent pattern mining incremental mining data mining association rule mining frequent pattern mining incremental mining
  • 相关文献

参考文献13

  • 1Agrawal R, Srikant R. Fast algorithm for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, Santiago de Chile, Chile, September 12-15, 1994,pp.487-499.
  • 2Zaki M J. Scalable algorithms for association mining.IEEE Trans. Knowledge and Data Engineering. 2000,12(3): 372-390.
  • 3Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data, Dallas, TX, USA,May 14-19, 2000, pp.1-12.
  • 4Pei J, Han J, Lu H, Nishio S, Tang S, Yang D. H-Mine:Hyper-structure mining of frequent patterns in large databases. In Proc. 2001 Int. Conf. Data Mining,San Jose, CA, USA, Nov.29-Dec.2, 2001, pp.441-448.
  • 5Tseng F, Hsu C. Generating frequent patterns with the Frequent Pattern List. Lecture Notes in Artificial Intelligence 2035, Cheung D, Williams G J, Li Q (eds.),Springer-Verlag, 2001, pp.376-386.
  • 6Cheung D, Han J, Ng V, Wong C. Maintenance of discovered association rules in large databases: An incremental updating technique. In Proc. 12th Int. Conf.Data Engineering, New Orleans, Louisiana, Feb. 26-Mar. 1, 1996, pp.106-114.
  • 7Cheung D, Lee S, Kao B. A general incremental technique for maintaining discovered association rules. In Proc. 5th Int. Conf. Database Systems for Advanced Applications, Melbourne, Australia, April 1-4, 1997,pp.185-194.
  • 8Lee S, Cheung D. Maintenance of discovered association rules: When to update? In Proc. 1997 SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD'97), Tucson, Arizona, May 11,1997.
  • 9Ezeife C I, Su Y. Mining incremental association rules with generalized FP-tree. Lecture Notes in Computer Science 2338, Cohen R, Spencer B (eds.), Springer-Verlag, 2002, pp.147-160.
  • 10Liu J, Yin J. Towards efficient data re-mining (DRM).Lecture Notes in Artificial Intelligence 2035, Cheung D, Williams G J, Li Q (eds.), Springer-Verlag, 2001,pp.406-412.

同被引文献51

引证文献9

二级引证文献41

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部