Funding: This research was funded by the National Natural Science Foundation of China under Grant No. 61806171; the Sichuan University of Science & Engineering Talent Project under Grant No. 2021RC15; the Open Fund Project of Key Laboratory for Non-Destructive Testing and Engineering Computer of Sichuan Province Universities on Bridge Inspection and Engineering under Grant No. 2022QYJ06; the Sichuan University of Science & Engineering Graduate Student Innovation Fund under Grant No. Y2023115; and the Scientific Research and Innovation Team Program of Sichuan University of Science and Technology under Grant No. SUSE652A006.
Abstract: While encryption technology safeguards the security of network communications, malicious traffic also uses encryption protocols to obscure its behavior. To address the reliance of traditional machine learning methods on expert experience and the insufficient representation capability of existing deep learning methods for encrypted malicious traffic, we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features, called the BERT-based Spatio-Temporal Features Network (BSTFNet). At packet-level granularity, the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers (BERT) model. At byte-level granularity, we first employ a Bidirectional Gated Recurrent Unit (BiGRU) model to extract temporal features from bytes, then use a Text Convolutional Neural Network (TextCNN) model with multi-sized convolution kernels to extract local multi-receptive-field spatial features. The fusion of features from both granularities serves as the final multidimensional representation of malicious traffic. Our approach achieves an accuracy of 99.39% and an F1-score of 99.40% on the publicly available USTC-TFC2016 dataset, and effectively reduces sample confusion within the Neris and Virut categories. The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
Funding: Supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean Government (MOTIE) (P0008703, The Competency Development Program for Industry Specialists) and by MSIT under the ICAN (ICT Challenge and Advanced Network of HRD) Program (No. IITP-2022-RS-2022-00156310), supervised by the Institute of Information & Communication Technology Planning and Evaluation (IITP).
Abstract: With the advancement of wireless network technology, vast amounts of traffic are being generated, and malicious traffic attacks that threaten the network environment are becoming increasingly sophisticated. Signature-based detection, static analysis, and dynamic analysis have previously been explored for malicious traffic detection, but they are limited in identifying diversified malware traffic patterns. Recent research has focused on applying machine learning to detect these patterns. However, applying machine learning to lightweight devices such as IoT devices is challenging because of the high computational demands and complexity of the learning process. In this study, we examined how machine learning-based malicious traffic detection can be used effectively on lightweight devices. We introduced the suboptimal feature selection model (SFSM), a feature selection technique designed to reduce complexity while maintaining the effectiveness of malicious traffic detection. Detection performance was evaluated on several traffic classes (benign, exploits, and generic) using the UNSW-NB15 dataset; SFSM sub-optimized the hyperparameters for feature selection and narrowed the search scope to encompass all features. SFSM improved learning performance while minimizing complexity by treating feature selection and exhaustive search as two steps, a problem not considered in conventional models. Our experimental results showed that detection accuracy improved by approximately 20% compared to the random model, and the reduction in accuracy compared to the greedy model, which performs an exhaustive search on all features, was kept within 6%. Additionally, latency and complexity were reduced by approximately 96% and 99.78%, respectively, compared to the greedy model. This study demonstrates that malicious traffic can be detected effectively even in lightweight device environments, and SFSM verified the possibility of detecting various kinds of attack traffic on such devices.
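The abstract does not spell out SFSM's internals, so the following is only a minimal sketch of the general idea of splitting the work into two steps: first narrow the feature set cheaply, then search exhaustively inside the narrowed set. The function name `two_step_search`, the `cheap_score` ranking, the `evaluate` callback, and the `keep` parameter are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple


def two_step_search(
    features: Sequence[str],
    cheap_score: Dict[str, float],
    evaluate: Callable[[Tuple[str, ...]], float],
    keep: int,
) -> Tuple[Tuple[str, ...], float, int]:
    """Illustrative two-step search (an assumption, not the paper's SFSM).

    Step 1 keeps the `keep` features with the highest cheap score.
    Step 2 evaluates every non-empty subset of that shortlist.
    Returns (best subset, its value, number of evaluations performed).
    """
    # Step 1: narrow the search scope with an inexpensive per-feature ranking.
    shortlist = sorted(features, key=lambda f: cheap_score.get(f, 0.0), reverse=True)[:keep]

    # Step 2: exhaustive search, but only over subsets of the shortlist.
    best_subset: Tuple[str, ...] = ()
    best_value = float("-inf")
    evaluations = 0
    for size in range(1, len(shortlist) + 1):
        for subset in combinations(shortlist, size):
            evaluations += 1
            value = evaluate(subset)  # e.g. validation accuracy of a model
            if value > best_value:
                best_subset, best_value = subset, value
    return best_subset, best_value, evaluations
```

With 5 features, a full exhaustive search would evaluate 2^5 - 1 = 31 subsets, while a shortlist of 3 needs only 2^3 - 1 = 7; the gap grows exponentially with the number of features, which is the kind of complexity reduction the abstract reports.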
Funding: Supported in part by the National Key R&D Program of China under Grant 2018YFA0701601, in part by the National Natural Science Foundation of China (Grant No. U22A2002, 61941104, 62201605), and in part by the Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute.
Abstract: In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic because of the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource bottleneck of traditional malicious traffic defenders, which are deployed at the victim side, and largely avoids the need for labeled traffic data in model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. We then fine-tune this model with a transfer learning method, using a small amount of labeled data, to improve its accuracy in malicious traffic detection. Experimental results on the CICDDoS2019 dataset show that our method achieves an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data. Moreover, with 1% of the training data, our method outperforms state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall, and F1-score.
Abstract: Detecting malicious traffic over the internet is a challenging research area aimed at protecting network infrastructure from malicious activity. An attacker can exploit shortcomings of a network system to gain unauthorized access through malicious traffic. Safeguarding against such attacks requires an efficient automatic system that detects malicious traffic in time to avoid system damage. Many automated systems can already detect malicious activity; however, their efficacy and accuracy need further improvement to detect malicious traffic from multi-domain systems. The present study focuses on detecting malicious traffic with high accuracy using machine learning techniques. The proposed approach uses two datasets, UNSW-NB15 and IoTID20, which contain local network traffic and IoT-based traffic, respectively. The two datasets were combined so that the proposed approach can detect malicious traffic from both local and IoT networks with high accuracy. Merging the datasets requires an equal number of features, which was achieved by reducing the feature count of each dataset to 30 using principal component analysis (PCA). The proposed model, extra boosting forest (EBF), is a stacked ensemble that combines tree-based models: an extra tree classifier, a gradient boosting classifier, and a random forest. Empirical results show that EBF performed significantly better than the compared models and achieved the highest accuracy scores, 0.985 and 0.984, on the multi-domain dataset for two and four classes, respectively.
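Abstract: With the widespread adoption of Internet of Things (IoT) technology, designing a high-accuracy, lightweight malicious traffic identification method suited to the limited computing and storage capacity of IoT devices is important for keeping those devices secure. This paper proposes Packets in a Session to Grayscale Image (PS2GI), a method that converts the packets in a session into grayscale images built from raw traffic data, together with an identification method based on the attention mechanism of a Simplified Hybrid Vision Transformer (SHViT) deep learning model, to achieve high-accuracy, lightweight malicious traffic identification. Experimental results on the IoT-23 dataset show that, compared with the ViT model in the multi-class setting, the accuracy of SHViT drops by 0.17% to 99.70% and its inference time increases by 33.8% to 6.37 ms, while its parameter count falls by 68.1% to 3.06M and its computational cost falls by 41.7%.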
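The abstract names the PS2GI conversion but not its exact layout, so the sketch below only illustrates the general idea of turning raw session bytes into a grayscale image. It assumes one packet per image row, truncation of long packets, and zero padding; the function name `session_to_grayscale` and both size parameters are hypothetical.

```python
from typing import List


def session_to_grayscale(
    packets: List[bytes],
    n_packets: int = 16,
    bytes_per_packet: int = 64,
) -> List[List[int]]:
    """Build an n_packets x bytes_per_packet grayscale image from a session.

    Assumed layout (not taken from the paper): row i holds the first
    bytes_per_packet bytes of packet i; each byte becomes one pixel in
    the range 0..255; missing bytes and missing packets are zero-padded.
    """
    image: List[List[int]] = []
    for packet in packets[:n_packets]:
        row = list(packet[:bytes_per_packet])            # truncate long packets
        row.extend([0] * (bytes_per_packet - len(row)))  # pad short packets
        image.append(row)
    # Pad sessions that contain fewer than n_packets packets with black rows.
    while len(image) < n_packets:
        image.append([0] * bytes_per_packet)
    return image
```

A fixed image size matters because a Vision Transformer style model splits its input into a fixed grid of patches, so every session has to map to the same shape regardless of how many packets it contains.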
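Abstract: To address the limited resources of IoT devices and the need to balance traffic detection accuracy against time overhead, a lightweight classification algorithm, FastSplit-RF (random forest with fast split), is proposed. A general feature extraction pipeline is designed for IoT traffic. Building on the random forest algorithm, a multi-armed bandit strategy replaces the exhaustive traversal performed during node splitting, which enables fast node splits and efficient, lightweight IoT traffic classification. Experiments show that, compared with the random forest algorithm, FastSplit-RF improves accuracy by 2.45% while increasing detection speed by 62.16% and reducing memory usage by 48.68%.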
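The abstract does not say which bandit formulation FastSplit-RF uses, so the following is a sketch under assumptions rather than the paper's algorithm. It treats each candidate split (feature index, threshold) as an arm, estimates its Gini gain on small random batches instead of the full node, and allocates batches with the standard UCB1 rule. The names `bandit_split`, `split_gain`, and the `pulls` and `batch` parameters are hypothetical; labels are assumed to be binary (0 or 1).

```python
import math
import random
from typing import List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity for binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gain(rows, labels, feature: int, threshold: float) -> float:
    """Impurity reduction obtained by splitting on rows[i][feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def bandit_split(
    rows: List[List[float]],
    labels: List[int],
    candidates: List[Tuple[int, float]],
    pulls: int = 200,
    batch: int = 32,
    seed: int = 0,
) -> Tuple[int, float]:
    """Pick a split with UCB1 over candidates, scoring each pull on a mini-batch."""
    rng = random.Random(seed)
    counts = [0] * len(candidates)
    means = [0.0] * len(candidates)
    for t in range(1, pulls + 1):
        if t <= len(candidates):
            arm = t - 1  # play every arm once before using the UCB rule
        else:
            arm = max(
                range(len(candidates)),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        idx = [rng.randrange(len(rows)) for _ in range(batch)]
        feature, threshold = candidates[arm]
        reward = split_gain([rows[i] for i in idx], [labels[i] for i in idx], feature, threshold)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    best = max(range(len(candidates)), key=lambda a: means[a])
    return candidates[best]
```

The cost of this sketch is `pulls * batch` sample evaluations per node, independent of node size, whereas a full traversal scores every candidate on every sample in the node. That trade of exactness for a bounded budget is the general motivation for bandit-based splitting.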