While encryption technology safeguards the security of network communications,malicious traffic also uses encryption protocols to obscure its malicious behavior.To address the issues of traditional machine learning me...While encryption technology safeguards the security of network communications,malicious traffic also uses encryption protocols to obscure its malicious behavior.To address the issues of traditional machine learning methods relying on expert experience and the insufficient representation capabilities of existing deep learning methods for encrypted malicious traffic,we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features,called BERT-based Spatio-Temporal Features Network(BSTFNet).At the packet-level granularity,the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers(BERT)model.At the byte-level granularity,we initially employ the Bidirectional Gated Recurrent Unit(BiGRU)model to extract temporal features from bytes,followed by the utilization of the Text Convolutional Neural Network(TextCNN)model with multi-sized convolution kernels to extract local multi-receptive field spatial features.The fusion of features from both granularities serves as the ultimate multidimensional representation of malicious traffic.Our approach achieves accuracy and F1-score of 99.39%and 99.40%,respectively,on the publicly available USTC-TFC2016 dataset,and effectively reduces sample confusion within the Neris and Virut categories.The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.展开更多
Without explicit description of map application themes,it is difficult for users to discover desired map resources from massive online Web Map Services(WMS).However,metadata-based map application theme extraction is a...Without explicit description of map application themes,it is difficult for users to discover desired map resources from massive online Web Map Services(WMS).However,metadata-based map application theme extraction is a challenging multi-label text classification task due to limited training samples,mixed vocabularies,variable length and content arbitrariness of text fields.In this paper,we propose a novel multi-label text classification method,Text GCN-SW-KNN,based on geographic semantics and collaborative training to improve classifica-tion accuracy.The semi-supervised collaborative training adopts two base models,i.e.a modified Text Graph Convolutional Network(Text GCN)by utilizing Semantic Web,named Text GCN-SW,and widely-used Multi-Label K-Nearest Neighbor(ML-KNN).Text GCN-SW is improved from Text GCN by adjusting the adjacency matrix of the heterogeneous word document graph with the shortest semantic distances between themes and words in metadata text.The distances are calculated with the Semantic Web of Earth and Environmental Terminology(SWEET)and WordNet dictionaries.Experiments on both the WMS and layer metadata show that the proposed methods can achieve higher F1-score and accuracy than state-of-the-art baselines,and demonstrate better stability in repeating experiments and robustness to less training data.Text GCN-SW-KNN can be extended to other multi-label text classification scenario for better supporting metadata enhancement and geospatial resource discovery in Earth Science domain.展开更多
基金This research was funded by National Natural Science Foundation of China under Grant No.61806171Sichuan University of Science&Engineering Talent Project under Grant No.2021RC15+2 种基金Open Fund Project of Key Laboratory for Non-Destructive Testing and Engineering Computer of Sichuan Province Universities on Bridge Inspection and Engineering under Grant No.2022QYJ06Sichuan University of Science&Engineering Graduate Student Innovation Fund under Grant No.Y2023115The Scientific Research and Innovation Team Program of Sichuan University of Science and Technology under Grant No.SUSE652A006.
文摘While encryption technology safeguards the security of network communications,malicious traffic also uses encryption protocols to obscure its malicious behavior.To address the issues of traditional machine learning methods relying on expert experience and the insufficient representation capabilities of existing deep learning methods for encrypted malicious traffic,we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features,called BERT-based Spatio-Temporal Features Network(BSTFNet).At the packet-level granularity,the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers(BERT)model.At the byte-level granularity,we initially employ the Bidirectional Gated Recurrent Unit(BiGRU)model to extract temporal features from bytes,followed by the utilization of the Text Convolutional Neural Network(TextCNN)model with multi-sized convolution kernels to extract local multi-receptive field spatial features.The fusion of features from both granularities serves as the ultimate multidimensional representation of malicious traffic.Our approach achieves accuracy and F1-score of 99.39%and 99.40%,respectively,on the publicly available USTC-TFC2016 dataset,and effectively reduces sample confusion within the Neris and Virut categories.The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
基金supported by National Natural Science Foundation of China[No.41971349,No.41930107,No.42090010 and No.41501434]National Key Research and Development Program of China[No.2017YFB0503704 and No.2018YFC0809806].
文摘Without explicit description of map application themes,it is difficult for users to discover desired map resources from massive online Web Map Services(WMS).However,metadata-based map application theme extraction is a challenging multi-label text classification task due to limited training samples,mixed vocabularies,variable length and content arbitrariness of text fields.In this paper,we propose a novel multi-label text classification method,Text GCN-SW-KNN,based on geographic semantics and collaborative training to improve classifica-tion accuracy.The semi-supervised collaborative training adopts two base models,i.e.a modified Text Graph Convolutional Network(Text GCN)by utilizing Semantic Web,named Text GCN-SW,and widely-used Multi-Label K-Nearest Neighbor(ML-KNN).Text GCN-SW is improved from Text GCN by adjusting the adjacency matrix of the heterogeneous word document graph with the shortest semantic distances between themes and words in metadata text.The distances are calculated with the Semantic Web of Earth and Environmental Terminology(SWEET)and WordNet dictionaries.Experiments on both the WMS and layer metadata show that the proposed methods can achieve higher F1-score and accuracy than state-of-the-art baselines,and demonstrate better stability in repeating experiments and robustness to less training data.Text GCN-SW-KNN can be extended to other multi-label text classification scenario for better supporting metadata enhancement and geospatial resource discovery in Earth Science domain.