
Online Mining Closed Frequent Itemsets in Mixed Window over Data Streams (在线挖掘数据流混合窗口中闭频繁项集)

Cited by: 2
Abstract: In data stream mining, the landmark window accounts for the influence of historical patterns on the current mining result, but it ignores the decay of patterns over time. The sliding window records the most recent and most useful patterns, but its optimal size cannot be determined accurately. For data with data-stream characteristics that arise in some simulation systems, a method named T-Moment is proposed for mining closed frequent itemsets in a mixed window. The method records pattern information completely in a single scan of the data stream. The pruning method proposed in T-Moment reduces the space complexity of the sliding-window tree (F-tree) and the maintenance cost of the closed frequent pattern tree (T-tree). In addition, the proposed time-decay mechanism distinguishes historical patterns from recent ones. Extensive simulation experiments show that T-Moment achieves good efficiency and accuracy.
Source: Journal of System Simulation (《系统仿真学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 9, pp. 2110-2114, 2119 (6 pages)
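Funding: National High Technology Research and Development Program of China (863 Program), grant 2007AA04Z116; National Natural Science Foundation of China, grant 70871033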
Keywords: simulation data; closed frequent pattern; mixed window; time decaying
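
The abstract does not give T-Moment's data structures or its decay formula, so the following is only a minimal sketch of the general idea of a mixed window: transactions inside a sliding window are counted exactly, and transactions that have left the window keep contributing with an exponentially decaying weight. The class name `MixedWindowCounter`, the `decay` parameter, and the rule for combining recent and historical counts are all assumptions made for illustration; the F-tree, T-tree, and pruning method of the paper are not reproduced here.

```python
from collections import deque
from itertools import combinations


def _subsets(transaction):
    """Yield every non-empty subset of a transaction (exponential; toy use only)."""
    items = sorted(transaction)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


class MixedWindowCounter:
    """Hypothetical support counter: exact in the window, decayed for history."""

    def __init__(self, window_size, decay):
        self.window = deque()       # most recent transactions, counted exactly
        self.window_size = window_size
        self.decay = decay          # assumed decay factor in (0, 1]
        self.history = {}           # itemset -> decayed support of expired data

    def add(self, transaction):
        self.window.append(frozenset(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            # Age every historical count by one step ...
            for key in self.history:
                self.history[key] *= self.decay
            # ... then fold the expired transaction into the history.
            for itemset in _subsets(expired):
                self.history[itemset] = self.history.get(itemset, 0.0) + 1.0

    def support(self, itemset):
        itemset = frozenset(itemset)
        recent = sum(1 for t in self.window if itemset <= t)
        return recent + self.history.get(itemset, 0.0)


counter = MixedWindowCounter(window_size=2, decay=0.5)
for t in [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}]:
    counter.add(t)
print(counter.support({"a"}))       # recent count plus decayed history
print(counter.support({"a", "b"}))
```

With `decay=1.0` this sketch behaves like a landmark window (all history counts fully), and with `decay` close to 0 it approaches a plain sliding window, which is the trade-off the abstract describes.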

References (13)

  • 1. Ao Fujiang, Yan Yuejin, Liu Baohong, Huang Kedi. Online mining of maximal frequent itemsets in a sliding window over data streams [J]. Journal of System Simulation, 2009, 21(4): 1134-1139. (in Chinese) Cited by: 9
  • 2. Kuang Zhufang, Yang Guogui, Xin Dongjun. SWFPM: An effective algorithm for mining frequent items over data streams [J]. Application Research of Computers, 2009, 26(2): 466-469. (in Chinese) Cited by: 4
  • 3. Mohammed J Zaki, Ching-Jui Hsiao. CHARM: An Efficient Algorithm for Closed Itemset Mining [C]// 2nd SIAM Int'l Conf. on Data Mining, 2002. USA: SIAM, 2002: 457-473.
  • 4. Jianyong Wang, Jiawei Han, Jian Pei. CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets [C]// IEEE Computer Society ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. USA: IEEE, August 2003: 236-245.
  • 5. Yun Chi, Haixun Wang, Philip S Yu, et al. Moment: Maintaining closed frequent itemsets over a data stream sliding window [C]// Proceedings of the Fourth IEEE International Conference on Data Mining. Brighton, UK: IEEE Press, 2004: 59-66.
  • 6. Li Guohui, Chen Hui. Mining frequent patterns in an arbitrary sliding time window over data streams [J]. Journal of Software, 2008, 19(10): 2585-2596. (in Chinese) Cited by: 45
  • 7. Jiang Nan, Gruenwald Le. CFI-Stream: Mining closed frequent itemsets in data streams [C]// Roberto B, Kristin PB, Gautam D, Dimitrios G, Johannes G. The 12th ACM SIGKDD Int'l Conf on Knowledge and Data Mining. Philadelphia, USA: ACM Press, 2006: 592-597.
  • 8. Giannella C, Han J, Pei J, Yan X, Yu PS. Mining frequent patterns in data streams at multiple time granularities [C]// Data Mining: Next Generation Challenges and Future Directions, H Kargupta, A Joshi, K Sivakumar, Y Yesha (eds.), Next Generation Data Mining. USA: AAAI/MIT, 2003: 191-212.
  • 9. Leung CKS, Khan QI. DStree: A tree structure for the mining of frequent sets from data streams [C]// Clifton CW, Zhong N, Liu JM, Wah BW, Wu XD. Proc. of the 6th Int'l Conf on Data Mining, Hong Kong, China. USA: IEEE Press, 2006: 928-932.
  • 10. Liu Xuejun, Xu Hongbing, Dong Yisheng, Qian Jiangbo, Wang Yongli. Mining closed frequent patterns over data streams based on a sliding window [J]. Journal of Computer Research and Development, 2006, 43(10): 1738-1743. (in Chinese) Cited by: 26

Secondary references (41)

  • 1. Wang Weiping, Li Jianzhong, Zhang Dongdong, Guo Longjiang. An effective algorithm for mining approximate frequent items over data streams [J]. Journal of Software, 2007, 18(4): 884-892. (in Chinese) Cited by: 33
  • 2. B Babcock, S Babu, M Datar, R Motwani, J Widom. Models and Issues in Data Stream Systems [C]// Proc. of PODS'2002. USA: ACM, 2002: 1-16.
  • 3. D Lee, W Lee. Finding maximal frequent itemsets over online data streams adaptively [C]// Proc. of the Fifth IEEE International Conference on Data Mining, Houston. USA: IEEE, 2005: 266-273.
  • 4. H Li, S Lee, M Shan. Online mining (recently) maximal frequent itemsets over data streams [C]// Proc. of the Fifteenth International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, Tokyo, Japan. USA: IEEE, 2005: 11-18.
  • 5. G Mao, X Wu, X Zhu, et al. Mining maximal frequent itemsets from data streams [J]. Journal of Information Science, 2007, 33(3): 251-262.
  • 6. G Grahne, J Zhu. Efficiently Using Prefix-trees in Mining Frequent Itemsets [C]// Proc. of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations. USA: IEEE, 2003.
  • 7. Y Yan, Z Li, H Chen. Fast Mining Maximal Frequent ItemSets Based on FP-Tree [C]// Proc. of AI'2004, Cairns, Australia, December, 2004. Germany: Springer, 2004: 475-487.
  • 8. F Ao, Y Yan, J Huang, K Huang. A Novel Pruning Technique for Mining Maximal Frequent Itemsets [C]// Proc. of FSKD'2007, Haikou, China, August, 2007. USA: IEEE, 2007: 469-473.
  • 9. Y Zhu, D Shasha. StatStream: Statistical monitoring of thousands of data streams in real time [C]// Proc. of the 28th Int'l Conf. on Very Large Data Bases. Hong Kong: Morgan Kaufmann, 2002: 358-369.
  • 10. J Han, J Pei, Y Yin. Mining frequent patterns without candidate generation [C]// Proc. of the Special Interest Group on Management of Data 2000. USA: ACM, 2000: 1-12.

Co-citing literature (73)

Co-cited literature (17)

  • 1. Jiang Maosheng, Wang Hao, Yao Hongliang. Research on a sequence algorithm for incremental learning of the naive Bayes classifier [J]. Computer Engineering and Applications, 2004, 40(14): 57-59. (in Chinese) Cited by: 10
  • 2. Liu Xuejun, Xu Hongbing, Dong Yisheng, Qian Jiangbo, Wang Yongli. Mining closed frequent patterns over data streams based on a sliding window [J]. Journal of Computer Research and Development, 2006, 43(10): 1738-1743. (in Chinese) Cited by: 26
  • 3. Ju Rusheng, Qiao Haiquan, Huang Kedi. Testing and evaluation of HLA simulation systems based on data mining [J]. Systems Engineering and Electronics, 2006, 28(10): 1599-1602. (in Chinese) Cited by: 6
  • 4. IBM Research - Almaden [EB/OL]. http://www.almaden.ibm.com.
  • 5. Frequent Itemset Mining Dataset Repository [EB/OL]. http://fimi.cs.helsinki.fi/data.
  • 6. Webb G I. Discovering significant patterns [J]. Machine Learning, 2007, 68(1): 1-33.
  • 7. Zaki M J, Hsiao C J. CHARM: An efficient algorithm for closed itemset mining [C]// Proceedings of the 2nd SIAM International Conference on Data Mining. Arlington, USA: IEEE Computer Society, 2002: 457-473.
  • 8. Chi Y, Wang H X, Philip S Yu, et al. Moment: Maintaining closed frequent itemsets over a data stream sliding window [C]// Proceedings of the 4th IEEE International Conference on Data Mining. Brighton, UK: IEEE Press, 2004: 59-66.
  • 9. Leung C K S, Khan Q I. DStree: A tree structure for the mining of frequent sets from data streams [C]// Proceedings of the 6th International Conference on Data Mining. Hong Kong, China: IEEE Press, 2006: 928-932.
  • 10. Jiang N, Gruenwald L. CFI-Stream: mining closed frequent itemsets in data streams [C]// Proceedings of the 12th ACM SIGKDD International Conference on Knowledge and Data Mining. Philadelphia, USA: ACM Press, 2006: 592-597.

Citing literature (2)

Secondary citing literature (1)
