
Frequent items mining algorithm over network flows based on the combination of hash method and counting method (cited by: 2)
Abstract: After analyzing the strengths and weaknesses of counting-based frequent item mining algorithms for data streams, and taking the practical characteristics of network flows into account, the authors propose CBF-TSFIM (written CBFTSFIM in the Chinese abstract), a frequent item mining algorithm for network flows that combines a hash method with a counting method. The algorithm first uses an improved counting Bloom filter (CBF) to filter out part of the infrequent flows without storing any flow information, which greatly reduces the number of flows that need further processing. It then applies TSFIM, a frequent item mining algorithm constrained by time and flow length (expanded in the original English abstract as "time-space based frequent items mining"), to extract the frequent flows. Tests on real network traffic show that CBF-TSFIM achieves very high space utilization and clearly outperforms algorithms such as Space Saving (SS) in both frequent item identification and flow length counting.
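The two-stage idea in the abstract (a hash-based prefilter followed by a counting stage) can be illustrated with a minimal Python sketch. This is not the authors' CBF-TSFIM: the abstract does not give the details of the improved CBF or of TSFIM, so the sketch assumes a plain counting Bloom filter and uses an exact counter as a stand-in for the second stage. The names `CountingBloomFilter`, `mine_frequent_flows`, and `prefilter_threshold` are hypothetical.

```python
import hashlib
from collections import Counter


class CountingBloomFilter:
    """Plain counting Bloom filter: an array of counters indexed by k hashes.

    Assumption: this is the textbook structure, not the improved CBF
    described in the paper.
    """

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, key):
        # Derive k indexes by hashing the key with k different seeds.
        # A set avoids incrementing the same counter twice for one key.
        indexes = set()
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            indexes.add(int.from_bytes(digest[:8], "big") % self.size)
        return indexes

    def add(self, key):
        """Increment the key's counters and return its estimated count.

        The estimate (minimum counter) never undercounts, but hash
        collisions can make it overcount.
        """
        indexes = self._indexes(key)
        for i in indexes:
            self.counters[i] += 1
        return min(self.counters[i] for i in indexes)


def mine_frequent_flows(packets, prefilter_threshold, cbf=None):
    """Stage 1: the CBF absorbs small flows without storing their IDs.
    Stage 2: flows whose estimate reaches the threshold are tracked.

    Assumption: stage 2 is an exact counter used as a placeholder for
    TSFIM, whose time and flow-length constraints are not specified in
    the abstract.
    """
    cbf = cbf if cbf is not None else CountingBloomFilter()
    tracked = Counter()
    for flow_id in packets:
        if flow_id in tracked:
            tracked[flow_id] += 1
        elif cbf.add(flow_id) >= prefilter_threshold:
            # Promote the flow, carrying over the filter's estimate.
            tracked[flow_id] = prefilter_threshold
    return tracked


if __name__ == "__main__":
    flows = ["10.0.0.1:80"] * 10 + ["10.0.0.2:443"] * 2 + ["10.0.0.3:53"]
    print(mine_frequent_flows(flows, prefilter_threshold=3))
```

In this sketch only flows that pass the prefilter ever occupy an entry in the tracked table, which is the source of the memory saving the abstract describes. The cost is that collisions in the filter can promote a small flow early or inflate its initial count.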
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2013, No. 9, pp. 57-62 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the Natural Science Foundation of Shaanxi Province (2012JZ8005)
Keywords: network flows; data mining; hash method; frequent item; counting method; counting Bloom filter (CBF)

