Peer-to-peer (P2P) botnets outperform the traditional Internet relay chat (IRC) botnets in evading detection and they have become a prevailing type of threat to the Internet nowadays.Current methods for detecting P2P ...Peer-to-peer (P2P) botnets outperform the traditional Internet relay chat (IRC) botnets in evading detection and they have become a prevailing type of threat to the Internet nowadays.Current methods for detecting P2P botnets,such as similarity analysis of network behavior and machine-learning based classification,cannot handle the challenges brought about by different network scenarios and botnet variants.We noticed that one important but neglected characteristic of P2P bots is that they periodically send requests to update their peer lists or receive commands from botmasters in the command-and-control (C&C) phase.In this paper,we propose a novel detection model named detection by mining regional periodicity (DMRP),including capturing the event time series,mining the hidden periodicity of host behaviors,and evaluating the mined periodic patterns to identify P2P bot traffic.As our detection model is built based on the basic properties of P2P protocols,it is difficult for P2P bots to avoid being detected as long as P2P protocols are employed in their C&C.For hidden periodicity mining,we introduce the so-called regional periodic pattern mining in a time series and present our algorithms to solve the mining problem.The experimental evaluation on public datasets demonstrates that the algorithms are promising for efficient P2P bot detection in the C&C phase.展开更多
The continuous emerging of peer-to-peer(P2P) applications enriches resource sharing by networks, but it also brings about many challenges to network management. Therefore, P2 P applications monitoring, in particular,P...The continuous emerging of peer-to-peer(P2P) applications enriches resource sharing by networks, but it also brings about many challenges to network management. Therefore, P2 P applications monitoring, in particular,P2 P traffic classification, is becoming increasingly important. In this paper, we propose a novel approach for accurate P2 P traffic classification at a fine-grained level. Our approach relies only on counting some special flows that are appearing frequently and steadily in the traffic generated by specific P2 P applications. In contrast to existing methods, the main contribution of our approach can be summarized as the following two aspects. Firstly, it can achieve a high classification accuracy by exploiting only several generic properties of flows rather than complicated features and sophisticated techniques. Secondly, it can work well even if the classification target is running with other high bandwidth-consuming applications, outperforming most existing host-based approaches, which are incapable of dealing with this situation. We evaluated the performance of our approach on a real-world trace. Experimental results show that P2 P applications can be classified with a true positive rate higher than 97.22% and a false positive rate lower than 2.78%.展开更多
Inverted index traversal techniques have been studied in addressing the query processing performance challenges of web search engines, but still leave much room for improvement. In this paper, we focus on the inverted...Inverted index traversal techniques have been studied in addressing the query processing performance challenges of web search engines, but still leave much room for improvement. In this paper, we focus on the inverted index traversal on document-sorted indexes and the optimization technique called dynamic pruning, which can efficiently reduce the hardware computational resources required. We propose another novel exhaustive index traversal scheme called largest scores first(LSF) retrieval, in which the candidates are first selected in the posting list of important query terms with the largest upper bound scores and then fully scored with the contribution of the remaining query terms. The scheme can effectively reduce the memory consumption of existing term-at-atime(TAAT) and the candidate selection cost of existing document-at-a-time(DAAT) retrieval at the expense of revisiting the posting lists of the remaining query terms. Preliminary analysis and implementation show comparable performance between LSF and the two well-known baselines. To further reduce the number of postings that need to be revisited, we present efficient rank safe dynamic pruning techniques based on LSF, including two important optimizations called list omitting(LSF_LO) and partial scoring(LSF_PS) that make full use of query term importance. Finally, experimental results with the TREC GOV2 collection show that our new index traversal approaches reduce the query latency by almost 27% over the WAND baseline and produce slightly better results compared with the Max Score baseline, while returning the same results as exhaustive evaluation.展开更多
基金Project (Nos.61170286 and 61202486) supported by the National Natural Science Foundation of China
文摘Peer-to-peer (P2P) botnets outperform the traditional Internet relay chat (IRC) botnets in evading detection and they have become a prevailing type of threat to the Internet nowadays.Current methods for detecting P2P botnets,such as similarity analysis of network behavior and machine-learning based classification,cannot handle the challenges brought about by different network scenarios and botnet variants.We noticed that one important but neglected characteristic of P2P bots is that they periodically send requests to update their peer lists or receive commands from botmasters in the command-and-control (C&C) phase.In this paper,we propose a novel detection model named detection by mining regional periodicity (DMRP),including capturing the event time series,mining the hidden periodicity of host behaviors,and evaluating the mined periodic patterns to identify P2P bot traffic.As our detection model is built based on the basic properties of P2P protocols,it is difficult for P2P bots to avoid being detected as long as P2P protocols are employed in their C&C.For hidden periodicity mining,we introduce the so-called regional periodic pattern mining in a time series and present our algorithms to solve the mining problem.The experimental evaluation on public datasets demonstrates that the algorithms are promising for efficient P2P bot detection in the C&C phase.
基金supported by the National Natural Science Foundation of China(Nos.61170286 and 61202486)
文摘The continuous emerging of peer-to-peer(P2P) applications enriches resource sharing by networks, but it also brings about many challenges to network management. Therefore, P2 P applications monitoring, in particular,P2 P traffic classification, is becoming increasingly important. In this paper, we propose a novel approach for accurate P2 P traffic classification at a fine-grained level. Our approach relies only on counting some special flows that are appearing frequently and steadily in the traffic generated by specific P2 P applications. In contrast to existing methods, the main contribution of our approach can be summarized as the following two aspects. Firstly, it can achieve a high classification accuracy by exploiting only several generic properties of flows rather than complicated features and sophisticated techniques. Secondly, it can work well even if the classification target is running with other high bandwidth-consuming applications, outperforming most existing host-based approaches, which are incapable of dealing with this situation. We evaluated the performance of our approach on a real-world trace. Experimental results show that P2 P applications can be classified with a true positive rate higher than 97.22% and a false positive rate lower than 2.78%.
文摘Inverted index traversal techniques have been studied in addressing the query processing performance challenges of web search engines, but still leave much room for improvement. In this paper, we focus on the inverted index traversal on document-sorted indexes and the optimization technique called dynamic pruning, which can efficiently reduce the hardware computational resources required. We propose another novel exhaustive index traversal scheme called largest scores first(LSF) retrieval, in which the candidates are first selected in the posting list of important query terms with the largest upper bound scores and then fully scored with the contribution of the remaining query terms. The scheme can effectively reduce the memory consumption of existing term-at-atime(TAAT) and the candidate selection cost of existing document-at-a-time(DAAT) retrieval at the expense of revisiting the posting lists of the remaining query terms. Preliminary analysis and implementation show comparable performance between LSF and the two well-known baselines. To further reduce the number of postings that need to be revisited, we present efficient rank safe dynamic pruning techniques based on LSF, including two important optimizations called list omitting(LSF_LO) and partial scoring(LSF_PS) that make full use of query term importance. Finally, experimental results with the TREC GOV2 collection show that our new index traversal approaches reduce the query latency by almost 27% over the WAND baseline and produce slightly better results compared with the Max Score baseline, while returning the same results as exhaustive evaluation.