Abstract: Network anomaly detection plays a vital role in safeguarding network security. However, existing network anomaly detection is typically framed as a one-class, zero-positive task. This approach is susceptible to overfitting during training because the data distribution differs between the training set and the test set, a phenomenon known as prediction drift. Additionally, the rarity of anomalous data, which is often masked by normal data, further complicates network anomaly detection. To address these challenges, we propose PUNet, a network that combines the strengths of traditional machine learning and deep learning techniques for anomaly detection. Specifically, PUNet pre-trains a reconstruction-based autoencoder on normal data, enabling the network to capture latent features and correlations within the data. Subsequently, PUNet uses a sampling algorithm to construct a pseudo-label candidate set from the outliers, based on the reconstruction loss of the samples. Incorporating these abnormal samples mitigates the prediction drift problem. Furthermore, PUNet uses the CatBoost classifier for anomaly detection to handle potential data imbalance within the candidate set. Extensive experimental evaluations demonstrate that PUNet effectively addresses the prediction drift and data imbalance problems and significantly outperforms competing methods.
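The pseudo-label step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the sampling rule, so everything here is an assumption: the function name `select_pseudo_anomalies`, the quantile-based threshold, and the choice to rank candidates by loss are hypothetical. Reconstruction losses are taken as plain inputs, as if produced by some already trained autoencoder; the autoencoder and the CatBoost classifier are not shown.

```python
def select_pseudo_anomalies(unlabeled_losses, normal_losses, quantile=0.95):
    """Return indices of unlabeled samples whose reconstruction loss exceeds
    a threshold derived from losses on known-normal data.

    Assumption: the threshold is an empirical quantile of the normal losses.
    The abstract only says candidates are chosen "based on the
    reconstruction loss"; the exact rule is not given there.
    """
    ordered = sorted(normal_losses)
    # Index of the empirical quantile, clamped to the last element.
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    threshold = ordered[idx]
    # Keep only samples that reconstruct worse than the threshold.
    candidates = [i for i, loss in enumerate(unlabeled_losses) if loss > threshold]
    # Most poorly reconstructed samples first.
    candidates.sort(key=lambda i: unlabeled_losses[i], reverse=True)
    return candidates


# Hypothetical loss values, for illustration only.
normal = [0.10, 0.20, 0.15, 0.12, 0.18, 0.11, 0.13, 0.17, 0.16, 0.14]
unlabeled = [0.12, 0.90, 0.15, 1.40, 0.19]
print(select_pseudo_anomalies(unlabeled, normal))
```

In a fuller pipeline, the selected indices would be given pseudo-label "anomalous" and passed, together with normal samples, to a classifier; the abstract names CatBoost for that stage.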
Funding: This work was supported in part by an Institute of Information and Communications Technology Promotion (ITP) grant funded by the Korea government (MSIP) (No. 2016-0-00078, Cloud-based Security Intelligence Technology Development for the Customized Security Service Provisioning).
Abstract: A substantial body of work has sought to identify network anomalies using supervised and unsupervised learning techniques, each with its own strengths and weaknesses. In this work, we propose a new approach that takes advantage of both unsupervised and supervised learning. The main objective of the proposed approach is to enable supervised anomaly detection without requiring users to provide the associated labels. To this end, we estimate the label of each connection in the training phase using clustering. The "estimated" labels are then used to build a supervised learning model that classifies connections in the testing stage. To improve the quality of the estimated labels, we define a new property that characterizes anomalies in the context of network anomaly detection. Through extensive experiments with a public dataset (NSL-KDD), we show that the proposed method can achieve performance comparable to that of a model trained with the "original" labels provided in the dataset. We also introduce two heuristic functions that reduce the impact of the randomness of clustering and thereby improve the overall quality of the estimated labels.
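The two-stage idea in this abstract (cluster to estimate labels, then train a supervised model on them) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithm, the anomaly-defining property, or the classifier, so the sketch substitutes 2-means clustering, a "minority cluster is anomalous" rule, and a k-nearest-neighbour classifier. All function names are hypothetical.

```python
import math
from collections import Counter


def two_means(points, iters=20):
    """Deterministic 2-means: seed with the first point and the point
    farthest from it, then alternate assignment and centroid updates."""
    c0 = points[0]
    c1 = max(points, key=lambda p: math.dist(p, c0))
    centroids = [c0, c1]
    assign = []
    for _ in range(iters):
        assign = [0 if math.dist(p, centroids[0]) <= math.dist(p, centroids[1]) else 1
                  for p in points]
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def estimate_labels(points):
    """Assumed rule (not from the abstract): the smaller cluster is anomalous (1)."""
    assign = two_means(points)
    minority = 0 if assign.count(0) < assign.count(1) else 1
    return [1 if a == minority else 0 for a in assign]


def knn_predict(train_points, train_labels, x, k=3):
    """Supervised stage: majority vote among the k nearest training points."""
    nearest = sorted(zip(train_points, train_labels),
                     key=lambda pl: math.dist(pl[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-feature connection records, for illustration only.
connections = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8),
               (1.3, 1.0), (1.0, 1.3), (0.7, 0.9),
               (10.0, 10.0), (9.8, 10.3), (10.2, 9.7)]
labels = estimate_labels(connections)      # no ground-truth labels used
print(labels)
print(knn_predict(connections, labels, (9.5, 10.2)))
```

The point of the design is that the classifier never sees ground-truth labels; its quality therefore depends entirely on how well the clustering stage and the anomaly rule estimate them.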
Abstract: The growing volume and complexity of network traffic have made it harder to identify abnormal behaviours that may indicate security breaches or operational interruptions. Conventional detection approaches struggle to keep up with the ever-changing strategies of cyber-attacks, leaving network infrastructures more vulnerable and exposed to significant harm. To tackle this issue, this project focused on developing an effective anomaly detection system based on machine learning. The proposed model uses contemporary machine learning algorithms and frameworks to autonomously detect deviations from typical network behaviour and to promptly identify anomalous activities that may indicate security breaches or performance problems. The solution covers data collection, preprocessing, feature engineering, model training, and evaluation. The model is trained on a wide range of datasets that include both regular and abnormal network traffic patterns so that it can adapt to many scenarios. The main priority is a functional and efficient system, with particular emphasis on reducing false positives to avoid unwanted alerts. Efforts are also directed at improving anomaly detection accuracy so that the model can consistently distinguish between potentially harmful and benign activity. This project aims to strengthen network security by addressing emerging cyber threats and improving the resilience and reliability of networks.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61671142 and the Fundamental Research Funds for the Central Universities under Grant No. 02190022117021.
Abstract: Ensemble learning for anomaly detection on data structured as a complex network has barely been studied, owing to the inconsistent performance of complex network characteristics and the lack of an inherent objective function. We propose intuitionistic fuzzy set (IFS)-based anomaly detection, a new two-phase ensemble method, and apply it to the abnormal behavior detection problem in temporal complex networks. First, it constructs the IFS of a single network characteristic, which quantifies the degrees of membership, non-membership, and hesitation of each network characteristic with respect to the defined linguistic variables, so that unhelpful or noisy characteristics can still take part in the detection. To build an objective intuitionistic fuzzy relationship, we propose a Gaussian distribution-based membership function that gives a variable hesitation degree. Then, for the fuzzification of multiple network characteristics, the intuitionistic fuzzy weighted geometric operator is adopted to fuse multiple IFSs and to avoid inconsistency among the characteristics. Finally, the score function and precision function are used to sort the fused IFS. We carry out extensive experiments on several complex network datasets for anomaly detection, and the results demonstrate the superiority of our method over state-of-the-art approaches.
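The fusion and ranking phase can be sketched with the standard definitions from the intuitionistic fuzzy set literature, which may differ in detail from the paper's own: for pairs $(\mu_i, \nu_i)$ and weights $w_i$ summing to 1, the intuitionistic fuzzy weighted geometric (IFWG) operator gives $(\prod_i \mu_i^{w_i},\ 1 - \prod_i (1-\nu_i)^{w_i})$, the score function is $\mu - \nu$, and the accuracy function (presumably what the abstract calls the precision function) is $\mu + \nu$. The sketch assumes that membership means "anomalous" and takes the per-characteristic pairs as given, since the abstract does not give the Gaussian membership function. Function names are hypothetical.

```python
import math


def ifwg(pairs, weights):
    """Fuse (membership, non-membership) pairs with the IFWG operator.
    Weights are assumed to be non-negative and to sum to 1."""
    mu = math.prod(m ** w for (m, _), w in zip(pairs, weights))
    nu = 1.0 - math.prod((1.0 - n) ** w for (_, n), w in zip(pairs, weights))
    return mu, nu


def score(pair):
    """Score function: membership minus non-membership."""
    return pair[0] - pair[1]


def accuracy(pair):
    """Accuracy function: membership plus non-membership (1 - hesitation)."""
    return pair[0] + pair[1]


def rank(fused):
    """Order keys by score, breaking ties by accuracy, highest first."""
    return sorted(fused, key=lambda k: (score(fused[k]), accuracy(fused[k])),
                  reverse=True)


# Hypothetical per-characteristic IFS values for three time windows.
windows = {
    "t1": [(0.9, 0.05), (0.7, 0.2)],
    "t2": [(0.3, 0.6), (0.4, 0.5)],
    "t3": [(0.6, 0.3), (0.5, 0.3)],
}
fused = {name: ifwg(pairs, [0.5, 0.5]) for name, pairs in windows.items()}
print(rank(fused))  # most anomalous window first, under the assumptions above
```

Because the operator is geometric, a single characteristic with very low membership pulls the fused membership down sharply, which is one way an ensemble can dampen inconsistent characteristics.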
Funding: This work was funded by the High-tech Research and Development Program of China (863 Program) under Grant 2006II01Z451.
Abstract: Network traffic anomalies are abnormal and pronounced changes in traffic. Local events such as temporary network congestion, Distributed Denial of Service (DDoS) attacks, and large-scale scans, or global events such as abnormal network routing, can cause network anomalies. Network anomaly detection and analysis are very important to Computer Security Incident Response Teams (CSIRTs). However, wide-scale traffic anomaly detection is very difficult because it requires extracting anomalous modes from large amounts of high-dimensional, noise-rich data and then interpreting those modes. This paper proposes a general method based on Principal Component Analysis (PCA) to analyze network anomalies. The method divides the traffic matrix into normal and anomalous subspaces, maps traffic vectors into the normal subspace, computes the distance from the detected vector to the average normal vector, and detects anomalies based on that distance.
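A minimal sketch of a PCA subspace detector follows, under stated assumptions. The abstract does not define the distance precisely, so this sketch uses a common variant: the top-k principal directions span the normal subspace, and the anomaly score is the norm of the part of the mean-centered traffic vector that lies outside that subspace. Power iteration with deflation stands in for a full eigendecomposition so the example needs only the standard library; the function names and the toy traffic matrix are hypothetical.

```python
import math
import random


def fit_normal_subspace(rows, k, iters=200, seed=0):
    """Return (mean vector, top-k principal directions) of the traffic matrix."""
    n, d = len(rows), len(rows[0])
    mu = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):  # power iteration toward the dominant direction
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
        components.append(v)
        # Deflate so the next pass finds the next direction.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mu, components


def residual_norm(x, mu, components):
    """Norm of the mean-centered vector after removing its normal-subspace part."""
    centered = [a - m for a, m in zip(x, mu)]
    residual = centered[:]
    for v in components:
        coeff = sum(c * vi for c, vi in zip(centered, v))
        residual = [r - coeff * vi for r, vi in zip(residual, v)]
    return math.sqrt(sum(r * r for r in residual))


# Toy traffic matrix: three links whose volumes move together.
traffic = [[float(t), 2.0 * t, 3.0 * t] for t in range(1, 11)]
mu, comps = fit_normal_subspace(traffic, k=1)
print(residual_norm([3.0, 6.0, 9.0], mu, comps))    # follows the usual pattern
print(residual_norm([5.5, 11.0, 30.0], mu, comps))  # third link deviates
```

A vector would be flagged when its residual norm exceeds a chosen threshold; how to set that threshold and how many components to keep are left open here, as the abstract does not state them.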