CMNL-SW Algorithm for Online Mining of Closed Frequent Itemsets over Data Streams (cited by: 2)
Abstract A new online mining algorithm, CMNL-SW (Closed map and num list-sliding window), is proposed. It uses two data structures: Closed map, which stores the closed itemsets mined so far, and Num list, which stores the sequence numbers of all distinct items. When a new transaction is added or an old transaction is deleted, the algorithm updates the current sliding window with a simple union over the item numbers contained in that transaction and an intersection of the transaction with the related closed itemsets already mined. This lets it output the closed frequent itemsets of the data stream online for any support threshold the user specifies. Theoretical analysis and experiments on the real datasets Mushroom and Retail-chain and on the synthetic dataset T40I10D100K show that the proposed algorithm clearly outperforms the classic algorithms Moment and CFI-Stream in both time and space efficiency, and that it remains stable as the number of processed transactions grows and as the data stream changes rapidly.
Source Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, PKU Core, 2012, No. 4, pp. 508-513 (6 pages)
Keywords mining algorithm; closed frequent itemsets; sliding window; data stream
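
The incremental update outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CMNL-SW implementation: the class `ClosedItemsetWindow` and its method names are hypothetical, the Num list structure is not modeled, and a plain dictionary stands in for Closed map. The sketch relies only on the general property that every closed itemset is an intersection of transactions, so a new transaction can only create closed itemsets of the form "transaction ∩ existing closed itemset", and a deleted transaction can only affect closed itemsets it contains. The support threshold is assumed to be an absolute count.

```python
from collections import deque


class ClosedItemsetWindow:
    """Closed itemsets and their supports over the last `size` transactions.

    Hypothetical illustration only; not the CMNL-SW data structures.
    """

    def __init__(self, size):
        self.size = size
        self.window = deque()  # transactions currently inside the window
        self.closed = {}       # frozenset(itemset) -> support in the window

    def _support_of(self, itemset):
        # The support of an itemset equals that of its closure, which is the
        # largest support among stored closed supersets (0 if none exists).
        return max((s for c, s in self.closed.items() if itemset <= c), default=0)

    def add(self, transaction):
        t = frozenset(transaction)
        if len(self.window) == self.size:
            self._remove(self.window.popleft())  # slide: expire the oldest
        self.window.append(t)
        # Candidates: t itself plus its intersection with each closed itemset.
        candidates = {t} | {c & t for c in self.closed}
        candidates.discard(frozenset())
        # Compute every new support from the old state before writing any.
        updated = {x: self._support_of(x) + 1 for x in candidates}
        self.closed.update(updated)

    def _remove(self, t):
        # Only closed itemsets contained in the expiring transaction change.
        affected = [c for c in self.closed if c <= t]
        for c in affected:
            self.closed[c] -= 1
        # An itemset stops being closed if it no longer occurs or if a proper
        # superset now has the same support.
        stale = [
            c for c in affected
            if self.closed[c] == 0
            or any(c < d and self.closed[c] == s for d, s in self.closed.items())
        ]
        for c in stale:
            del self.closed[c]

    def frequent(self, min_support):
        # The threshold is applied at query time, so it may differ per query.
        return {c: s for c, s in self.closed.items() if s >= min_support}


w = ClosedItemsetWindow(size=3)
for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]):
    w.add(t)
print(sorted("".join(sorted(c)) for c in w.frequent(2)))  # ['a', 'b', 'c']
```

Because the dictionary holds every closed itemset in the window regardless of frequency, the threshold passed to `frequent` can change between queries without any re-mining. The cost is a linear scan over the stored itemsets per update, which a real implementation would avoid with an index over items.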