In order to improve the precision of super point detection and control measurement resource consumption, this paper proposes a super point detection method based on sampling and data streaming algorithms (SDSD), and...In order to improve the precision of super point detection and control measurement resource consumption, this paper proposes a super point detection method based on sampling and data streaming algorithms (SDSD), and proves that only sources or destinations with a lot of flows can be sampled probabilistically using the SDSD algorithm. The SDSD algorithm uses both the IP table and the flow bloom filter (BF) data structures to maintain the IP and flow information. The IP table is used to judge whether an IP address has been recorded. If the IP exists, then all its subsequent flows will be recorded into the flow BF; otherwise, the IP flow is sampled. This paper also analyzes the accuracy and memory requirements of the SDSD algorithm , and tests them using the CERNET trace. The theoretical analysis and experimental tests demonstrate that the most relative errors of the super points estimated by the SDSD algorithm are less than 5%, whereas the results of other algorithms are about 10%. Because of the BF structure, the SDSD algorithm is also better than previous algorithms in terms of memory consumption.展开更多
本文提出使用语义增强的Counting B loom FilterReconstruction(RSECBF)算法来快速还原源串或给出源串的聚类特征.它给每个哈希函数独立的哈希映射空间以消除哈希函数的内部冲突;扩展哈希函数使其不受均匀性限制,使得哈希函数可以带有语...本文提出使用语义增强的Counting B loom FilterReconstruction(RSECBF)算法来快速还原源串或给出源串的聚类特征.它给每个哈希函数独立的哈希映射空间以消除哈希函数的内部冲突;扩展哈希函数使其不受均匀性限制,使得哈希函数可以带有语义;利用哈希串的重叠和数量一致性来解决同源哈希串拼接成源串的问题,为源串的还原创造了条件.本文针对Pareto分布的哈希函数,为主成分的还原提出了一个简洁的源串还原算法.对于直接选择部分比特的哈希映射而言,如果主成分分析中的RSECBF不能还原出源串,则还原出来的最长串就是源串的聚类特征.仿真及实际检验表明,B loom Filter可以扩展其哈希函数来实现语义增强,RSECBF还原的结果是可信的.本算法可以在异常行为发生的时候挖掘网络行为特征.展开更多
基金The National Basic Research Program of China(973Program)(No.2009CB320505)the Natural Science Foundation of Jiangsu Province(No. BK2008288)+1 种基金the Excellent Young Teachers Program of Southeast University(No.4009001018)the Open Research Program of Key Laboratory of Computer Network of Guangdong Province (No. CCNL200706)
文摘In order to improve the precision of super point detection and control measurement resource consumption, this paper proposes a super point detection method based on sampling and data streaming algorithms (SDSD), and proves that only sources or destinations with a lot of flows can be sampled probabilistically using the SDSD algorithm. The SDSD algorithm uses both the IP table and the flow bloom filter (BF) data structures to maintain the IP and flow information. The IP table is used to judge whether an IP address has been recorded. If the IP exists, then all its subsequent flows will be recorded into the flow BF; otherwise, the IP flow is sampled. This paper also analyzes the accuracy and memory requirements of the SDSD algorithm , and tests them using the CERNET trace. The theoretical analysis and experimental tests demonstrate that the most relative errors of the super points estimated by the SDSD algorithm are less than 5%, whereas the results of other algorithms are about 10%. Because of the BF structure, the SDSD algorithm is also better than previous algorithms in terms of memory consumption.