Funding: This work was supported by the National Natural Science Foundation of China (61871046).
Abstract: Attacks on websites and network servers are among the most critical threats in network security. Network behavior identification is one of the most effective ways to identify malicious network intrusions. Analyzing abnormal network traffic patterns and classifying traffic based on labeled network traffic data are among the most effective approaches for network behavior identification. Traditional methods for network traffic classification use algorithms such as Naive Bayes, Decision Tree, and XGBoost. However, network traffic classification, which is required for network behavior identification, generally suffers from low accuracy, even with recently proposed deep learning models. To improve network traffic classification accuracy, and thereby the network intrusion detection rate, this paper proposes a new network traffic classification model, called ArcMargin, which incorporates metric learning into a convolutional neural network (CNN) to make the CNN model more discriminative. ArcMargin maps network traffic samples from the same category closer together, while samples from different categories are mapped as far apart as possible. The metric learning regularization term is called additive angular margin loss, and it is embedded in the objective function of traditional CNN models. The proposed ArcMargin model is validated on three datasets and compared with several related algorithms. According to a set of classification indicators, the ArcMargin model is shown to perform better in both network traffic classification tasks and open-set tasks. Moreover, in open-set tasks, the ArcMargin model can cluster unknown data classes that do not exist in the training dataset.
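As a rough illustration of the additive angular margin idea mentioned in the abstract above, the following sketch computes margin-adjusted logits for a single sample. It is not the paper's implementation: the function names, the `margin` and `scale` values, and the toy two-class weights are all assumptions chosen for illustration, and the CNN that would produce the embedding is omitted.

```python
import math


def arc_margin_logits(embedding, class_weights, label, margin=0.5, scale=30.0):
    """Scaled cosine logits with an additive angular margin on the true class.

    `margin` and `scale` are illustrative defaults, not values from the paper.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    unit_embedding = normalize(embedding)
    logits = []
    for class_index, weight in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(unit_embedding, normalize(weight)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding drift
        theta = math.acos(cos_theta)
        if class_index == label:
            # Widen the angle to the true class centre; clamp at pi so the
            # cosine stays monotonic in the angle.
            theta = min(theta + margin, math.pi)
        logits.append(scale * math.cos(theta))
    return logits


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


# Toy example: a 2-D embedding and two hypothetical class centres.
embedding = [1.0, 0.2]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = cross_entropy(arc_margin_logits(embedding, class_weights, 0, margin=0.0), 0)
loss_margin = cross_entropy(arc_margin_logits(embedding, class_weights, 0, margin=0.5), 0)
```

With the margin enabled, the true-class logit shrinks while the other logits are unchanged, so the loss is larger for the same embedding. Training against that stricter target is what pushes samples of one class closer together in angle and away from other classes.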
Funding: This work was supported by the State Grid Science and Technology Project "Research on Key Technologies and Applications of Self-Service Big Data Governance of Power Grid" (5442YD180015).
Abstract: Network traffic classification, which maps network traffic to a specific class at different granularities, plays a vital role in network administration and cyber security. With the rapid development of network communication techniques, more and more network applications adopt encryption during communication, which poses significant challenges to traditional network traffic classification methods. On the one hand, traditional methods mainly depend on matching features at the application layer of the ISO/OSI reference model, so they fail to classify encrypted traffic. On the other hand, machine learning-based methods require features hand-crafted from network traffic data by human experts, which makes it difficult for them to deal with complex network protocols. In this paper, the convolution attention network (CAT) is proposed to overcome these difficulties. As an end-to-end model, CAT takes raw data as input and returns classification results automatically, without feature engineering by human experts. In CAT, an attention mechanism first estimates the importance of different bytes in the network traffic. Then, a convolutional neural network (CNN) learns features automatically and feeds its output into a softmax function to obtain the classification results. This enables CAT to learn enough information from network traffic data and ensures classification accuracy. Extensive experiments on the public encrypted network traffic dataset ISCX2016 demonstrate the effectiveness of the proposed model.
Funding: Supported by the National Basic Research Program of China (2009CB320505).
Abstract: Network traffic classification aims at identifying the application types of network packets. It is important for Internet service providers (ISPs) to manage bandwidth resources and ensure the quality of service for different network applications. However, most classification techniques using machine learning focus only on high flow accuracy and ignore byte accuracy. Because of the imbalance between elephant flows and mice flows on the Internet, a classifier tends to perform poorly on elephant flows. The elephant flows, however, consume much more bandwidth than mice flows. When such a classifier is deployed for traffic policing, the network management system cannot penalize elephant flows or avoid network congestion effectively. This article first explores the factors related to low byte accuracy, and then presents a new traffic classification method that improves byte accuracy with the aid of data cleaning. Experiments are carried out on three groups of real-world traffic datasets, and the method is compared with existing work in terms of improving byte accuracy. The experiments show that byte accuracy increases by about 22.31% on average. The method outperforms the existing one in most cases.
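The gap between flow accuracy and byte accuracy described above can be made concrete with a small calculation. The sketch below uses made-up flow records (labels and byte counts are hypothetical, not taken from the paper's datasets) and assumes the usual definitions: flow accuracy is the fraction of flows classified correctly, and byte accuracy is the fraction of bytes carried by correctly classified flows.

```python
def flow_and_byte_accuracy(flows):
    """Return (flow_accuracy, byte_accuracy).

    `flows` is an iterable of (true_label, predicted_label, byte_count) tuples.
    """
    flows = list(flows)
    correct = [flow for flow in flows if flow[0] == flow[1]]
    flow_accuracy = len(correct) / len(flows)
    byte_accuracy = sum(flow[2] for flow in correct) / sum(flow[2] for flow in flows)
    return flow_accuracy, byte_accuracy


# Hypothetical traffic: nine small "mice" flows classified correctly and
# one large "elephant" flow that is misclassified.
example_flows = [("web", "web", 2_000)] * 9 + [("p2p", "web", 982_000)]
flow_acc, byte_acc = flow_and_byte_accuracy(example_flows)
print(f"flow accuracy = {flow_acc:.3f}, byte accuracy = {byte_acc:.3f}")
# prints: flow accuracy = 0.900, byte accuracy = 0.018
```

A classifier can therefore look strong by flow count while mislabeling almost all of the bandwidth, which is the situation the abstract argues matters for traffic policing.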
Funding: Supported by the National Basic Research Program of China (2007CB310705), the Hi-Tech Research and Development Program of China (2007AA01Z255), the National Natural Science Foundation of China (60711140087), PCSIRT (IRT0609), and ISTCP (2006DFA 11040) of China.
Abstract: Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult because many applications use dynamic port numbers, masquerading techniques, and encryption to avoid detection. In this article, an approach is presented for online traffic classification that relies on observing the first n packets of a transmission control protocol (TCP) connection. Its key idea is to use the properties of the first ten observed packets of a TCP connection, together with a Bayesian network method, to build a classifier. This classifier can classify TCP flows dynamically as packets pass through it, by deciding whether a TCP flow belongs to a given application. The experimental results show that the proposed approach performs well in online Internet traffic classification and that it is superior to the naive Bayesian method.
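To illustrate the general idea of classifying a TCP flow from its first n packets, the sketch below builds a feature vector from the first packets and classifies it with a Gaussian naive Bayes model. This is deliberately the simpler baseline named in the abstract, not the Bayesian network the article proposes. The feature choice (signed packet sizes), the class names, the variance floor, and the toy training flows are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def first_n_features(packets, n=10):
    """Signed sizes of the first n packets, zero-padded to length n.

    `packets` is a list of (size_bytes, direction) pairs, where direction is
    +1 for client-to-server and -1 for server-to-client (an assumed encoding).
    """
    signed = [size * direction for size, direction in packets[:n]]
    return signed + [0] * (n - len(signed))


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes over numeric feature vectors."""

    def fit(self, rows, labels):
        by_class = defaultdict(list)
        for row, label in zip(rows, labels):
            by_class[label].append(row)
        self.stats = {}
        for label, members in by_class.items():
            columns = list(zip(*members))
            means = [sum(col) / len(col) for col in columns]
            # A variance floor of 1.0 avoids division by zero on constant features.
            variances = [
                max(sum((v - m) ** 2 for v in col) / len(col), 1.0)
                for col, m in zip(columns, means)
            ]
            log_prior = math.log(len(members) / len(rows))
            self.stats[label] = (log_prior, means, variances)
        return self

    def predict(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
                for x, m, var in zip(row, means, variances)
            )
        return max(self.stats, key=log_posterior)


# Hypothetical training flows, truncated to n=3 packets to keep the example short.
train_flows = [
    ("web", [(300, 1), (1460, -1), (1460, -1)]),
    ("web", [(350, 1), (1400, -1), (1460, -1)]),
    ("ssh", [(48, 1), (48, -1), (64, 1)]),
    ("ssh", [(52, 1), (48, -1), (80, 1)]),
]
model = GaussianNaiveBayes().fit(
    [first_n_features(packets, n=3) for _, packets in train_flows],
    [label for label, _ in train_flows],
)
prediction = model.predict(first_n_features([(320, 1), (1450, -1), (1460, -1)], n=3))
```

Because the feature vector has a fixed length, a decision can be made as soon as the nth packet is seen, which is what makes this style of classifier usable online.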