
Mining recent frequent items from a sliding window over data streams
Abstract: This paper proposes MRFI, an algorithm for mining recent frequent items in a single data stream. The algorithm uses a time-sensitive sliding-window model, which keeps the mining results current, and it uses a circular queue and a binary sort tree to store and process the stream data simply and efficiently. MRFI is an approximate algorithm that eliminates the influence of historical data on the mining results. Experiments on synthetic data produced by the IBM data generator demonstrate the feasibility and effectiveness of the algorithm.
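The abstract names the building blocks (a time-sensitive sliding window, a circular queue, and a binary sort tree) but gives no algorithmic detail. The following is only a minimal sketch of how these pieces could fit together, not the authors' MRFI. Everything here is an assumption made for illustration: the class and method names, the choice of one queue slot per time unit, integer time units starting at 0, the `min_support` threshold, and lazy deletion (expired items stay in the tree with a count of zero). Unlike the approximate algorithm described above, this sketch counts exactly.

```python
class Node:
    """Binary sort tree node: one item and its count inside the window."""
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.left = None
        self.right = None


class RecentFrequentItems:
    """Hypothetical sketch; not the MRFI algorithm from the paper."""

    def __init__(self, num_slots, min_support):
        self.num_slots = num_slots        # window length in time units
        self.min_support = min_support    # fraction of in-window arrivals
        self.slots = [{} for _ in range(num_slots)]  # circular queue
        self.head = 0                     # slot of the current time unit
        self.current_time = 0
        self.total = 0                    # arrivals currently in the window
        self.root = None                  # binary sort tree keyed by item

    def _find(self, item, create=False):
        """Look up an item in the tree, optionally inserting it."""
        if self.root is None:
            if create:
                self.root = Node(item)
            return self.root
        node = self.root
        while node.item != item:
            side = "left" if item < node.item else "right"
            child = getattr(node, side)
            if child is None:
                if not create:
                    return None
                child = Node(item)
                setattr(node, side, child)
            node = child
        return node

    def _advance(self, time_unit):
        """Rotate the queue, subtracting the counts of each expired slot."""
        steps = min(time_unit - self.current_time, self.num_slots)
        for _ in range(steps):
            self.head = (self.head + 1) % self.num_slots
            for item, c in self.slots[self.head].items():
                self._find(item).count -= c
                self.total -= c
            self.slots[self.head] = {}
        self.current_time = time_unit

    def add(self, item, time_unit):
        """Record one arrival; time units must be non-decreasing."""
        if time_unit < self.current_time:
            raise ValueError("time units must be non-decreasing")
        if time_unit > self.current_time:
            self._advance(time_unit)
        slot = self.slots[self.head]
        slot[item] = slot.get(item, 0) + 1
        self._find(item, create=True).count += 1
        self.total += 1

    def frequent_items(self):
        """In-order traversal returning (item, count) pairs above threshold."""
        threshold = self.min_support * self.total
        result, stack, node = [], [], self.root
        while stack or node:
            while node:
                stack.append(node)
                node = node.left
            node = stack.pop()
            if node.count > 0 and node.count >= threshold:
                result.append((node.item, node.count))
            node = node.right
        return result
```

With `num_slots=3`, an item that arrived only at time unit 0 no longer contributes once an arrival at time unit 3 rotates the queue, which is the sense in which old data stops influencing the result. Lazy deletion keeps the tree code short at the cost of retaining zero-count nodes; a fuller implementation would remove them.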
Authors: 刘超, 耿蕊
Source: 《齐齐哈尔大学学报(自然科学版)》 (Journal of Qiqihar University (Natural Science Edition)), 2010, No. 3, pp. 9-13 (5 pages)
Keywords: data stream; frequent patterns; sliding windows; circular queue; binary sort tree

