Funding: This work was supported by the National Natural Science Foundation of China (61871046).
Abstract: Attacks on websites and network servers are among the most critical threats in network security. Network behavior identification is one of the most effective ways to identify malicious network intrusions, and it commonly relies on analyzing abnormal network traffic patterns and on classifying traffic using labeled network traffic data. Traditional methods for network traffic classification use algorithms such as Naive Bayes, Decision Tree, and XGBoost. However, network traffic classification generally suffers from low accuracy, even with recently proposed deep learning models. To improve network traffic classification accuracy, and thereby the network intrusion detection rate, this paper proposes a new network traffic classification model, called ArcMargin, which incorporates metric learning into a convolutional neural network (CNN) to make the CNN model more discriminative. ArcMargin maps network traffic samples from the same category close together, while samples from different categories are mapped as far apart as possible. The metric learning regularization term is the additive angular margin loss, which is embedded in the objective function of traditional CNN models. The proposed ArcMargin model is validated on three datasets and compared with several related algorithms. According to a set of classification indicators, the ArcMargin model is shown to perform better in both network traffic classification tasks and open-set tasks. Moreover, in open-set tasks, the ArcMargin model can cluster unknown data classes that do not exist in the training dataset.
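The additive angular margin idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `margin` and `scale` defaults, and the two-class example values are assumptions chosen for illustration, and the sketch ignores the case where the angle plus the margin exceeds pi, which practical implementations usually handle separately.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arc_margin_loss(embedding, class_weights, label, margin=0.5, scale=30.0):
    """Cross-entropy over scaled cosines, with an angular margin added to
    the angle between the embedding and its true class weight vector.

    margin and scale are hypothetical defaults, not values from the paper.
    """
    unit_embedding = l2_normalize(embedding)
    logits = []
    for class_index, weight in enumerate(class_weights):
        unit_weight = l2_normalize(weight)
        cos_theta = sum(a * b for a, b in zip(unit_embedding, unit_weight))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if class_index == label:
            # Penalize the true class by widening its angle by the margin.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp for the softmax denominator.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]


# Hypothetical 2-D embedding and two class weight vectors.
sample = [1.0, 0.2]
weights = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = arc_margin_loss(sample, weights, label=0, margin=0.0, scale=10.0)
loss_margin = arc_margin_loss(sample, weights, label=0, margin=0.5, scale=10.0)
```

With the margin enabled, the same sample incurs a larger loss, so training has to pull it closer in angle to its class weight vector. This is the mechanism by which same-class samples end up more tightly grouped.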
Funding: This work was supported by the State Grid Science and Technology Project "Research on Key Technologies and Applications of Self-Service Big Data Governance of Power Grid" (5442YD180015).
Abstract: Network traffic classification, which assigns network traffic to specific classes at different granularities, plays a vital role in network administration and cyber security. With the rapid development of network communication techniques, more and more network applications adopt encryption during communication, which poses significant challenges to traditional network traffic classification methods. On the one hand, traditional methods mainly depend on matching features at the application layer of the ISO/OSI reference model, so they fail to classify encrypted traffic. On the other hand, machine learning-based methods require features hand-crafted from network traffic data by human experts, which makes it difficult for them to deal with complex network protocols. In this paper, the convolution attention network (CAT) is proposed to overcome these difficulties. As an end-to-end model, CAT takes raw data as input and returns classification results automatically, without feature engineering by human experts. In CAT, an attention mechanism first learns the importance of the different bytes of network traffic. Then, a convolutional neural network (CNN) learns features automatically and feeds its output into a softmax function to obtain the classification results. This enables CAT to learn enough information from network traffic data to ensure classification accuracy. Extensive experiments on the public encrypted network traffic dataset ISCX2016 demonstrate the effectiveness of the proposed model.
Funding: Supported by the National Basic Research Program of China (2009CB320505).
Abstract: Network traffic classification aims at identifying the application types of network packets. It is important for Internet service providers (ISPs) that need to manage bandwidth resources and ensure the quality of service for different network applications. However, most classification techniques using machine learning focus only on high flow accuracy and ignore byte accuracy. Because of the imbalance between elephant flows and mice flows on the Internet, the classifier obtains low classification performance for elephant flows, even though elephant flows consume much more bandwidth than mice flows. When such a classifier is deployed for traffic policing, the network management system cannot effectively penalize elephant flows and avoid network congestion. This article first explores the factors related to low byte accuracy and then presents a new traffic classification method that improves byte accuracy with the aid of data cleaning. Experiments are carried out on three groups of real-world traffic datasets, and the method is compared with existing work on improving byte accuracy. The experiments show that byte accuracy increased by about 22.31% on average, and the method outperforms the existing one in most cases.
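The gap between flow accuracy and byte accuracy described in this abstract can be made concrete with a small sketch. The function name and the flow records below are invented for illustration and are not taken from the paper's datasets. Flow accuracy is taken here as the fraction of flows classified correctly, and byte accuracy as the fraction of bytes carried by correctly classified flows.

```python
def flow_and_byte_accuracy(flows):
    """Return (flow_accuracy, byte_accuracy).

    flows is an iterable of (true_label, predicted_label, byte_count) tuples.
    """
    total_flows = correct_flows = 0
    total_bytes = correct_bytes = 0
    for true_label, predicted_label, byte_count in flows:
        total_flows += 1
        total_bytes += byte_count
        if true_label == predicted_label:
            correct_flows += 1
            correct_bytes += byte_count
    if total_flows == 0 or total_bytes == 0:
        raise ValueError("need at least one flow with a non-zero byte count")
    return correct_flows / total_flows, correct_bytes / total_bytes


# Made-up example: four small "mice" flows classified correctly and
# one large "elephant" flow misclassified.
example_flows = [
    ("web", "web", 1000),
    ("web", "web", 2000),
    ("dns", "dns", 500),
    ("web", "web", 1500),
    ("p2p", "web", 95000),
]
flow_acc, byte_acc = flow_and_byte_accuracy(example_flows)
print(f"flow accuracy = {flow_acc:.2f}, byte accuracy = {byte_acc:.2f}")
```

In this made-up example, four of five flows are correct, yet the single misclassified elephant flow carries 95% of the bytes, so byte accuracy is far lower than flow accuracy.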
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61003282; the Beijing Higher Education Young Elite Teacher Project; the China Next Generation Internet (CNGI) Project "Research and Trial on Evolving Next Generation Network Intelligence Capability Enhancement (NICE)"; the National Basic Research Program (973 Program) under Grant No. 2009CB320-505; the National Science and Technology Major Project "Research about Architecture of Mobile Internet" under Grant No. 2011ZX03-002-001-01; and the National High Technology Research and Development Program (863 Program) under Grant No. 2011AA010704.
Abstract: Deep Packet Inspection (DPI) is a popular method that can accurately identify flow data and its corresponding application, and it is widely used in common network management systems. However, the major limitation of DPI systems is that their signature libraries are mainly extracted manually, which makes it hard to obtain the signatures of new applications efficiently. Hence, in this paper, we propose an automatic signature extraction mechanism based on Principal Component Analysis (PCA). In the proposed method, signatures are expressed as serial consistent sequences constructed from principal components, instead of the normally separated substrings of the original data extracted by traditional methods. Extensive experiments on numerous sets of data have been carried out to evaluate the performance of the proposed scheme, and the results show that the proposed method achieves good performance in terms of accuracy and efficiency.
Funding: Supported by the National Basic Research Program of China (2007CB310705); the Hi-Tech Research and Development Program of China (2007AA01Z255); the National Natural Science Foundation of China (60711140087); PCSIRT (IRT0609); and ISTCP (2006DFA 11040) of China.
Abstract: Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult, as many applications use dynamic port numbers, masquerading techniques, and encryption to avoid detection. In this article, an approach is presented for online traffic classification that relies on observing the first n packets of a transmission control protocol (TCP) connection. Its key idea is to use the properties of the observed first ten packets of a TCP connection, together with a Bayesian network method, to build a classifier. This classifier can classify TCP flows dynamically as packets pass through it by deciding whether a TCP flow belongs to a given application. The experimental results show that the proposed approach performs well in online Internet traffic classification and that it is superior to the naive Bayesian method.