Abstract
Subdomain name conflicts reduce the accuracy of current DGA detection algorithms. To address this, the paper first studies the relationship between the distribution of domain name lengths and the probability of conflict, and proposes ways to resolve domain name conflicts at different lengths. For DGA domain detection, it then proposes a method based on an improved eXpose network and a BiLSTM with multi-head self-attention. The method fuses the improved eXpose convolutional network with the multi-head-attention BiLSTM network: a deep branch added to the original eXpose model lets features be extracted at a more comprehensive range of scales, while the BiLSTM extracts bidirectional contextual information from the domain name sequence. Comparative experiments on public datasets verify the effectiveness of extracting the complete domain name for ultra-short subdomains, and the model improves on existing methods in accuracy and the other evaluation metrics.
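The idea of falling back to the complete domain name for ultra-short subdomains can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `detection_input`, the threshold `SHORT_LABEL_MAX = 4`, and the naive rule that the label directly left of the last dot is the one a detector would normally see (multi-label suffixes such as `co.uk` are not handled).

```python
# Hypothetical cut-off: the abstract does not state the length the authors use.
SHORT_LABEL_MAX = 4


def detection_input(domain: str, short_label_max: int = SHORT_LABEL_MAX) -> str:
    """Choose the string handed to a DGA classifier (illustrative only).

    Long labels are passed on their own. Very short labels, which are more
    likely to coincide between benign and DGA samples, are replaced by the
    complete domain name so that the extra labels can disambiguate them.
    """
    # Normalise case and drop a trailing root dot, then split into labels.
    labels = domain.lower().rstrip(".").split(".")
    if len(labels) < 2:
        # A bare name has no suffix to strip; return it unchanged.
        return labels[0]
    # Naive assumption: the label directly left of the last dot is the
    # one a detector would normally extract.
    label = labels[-2]
    if len(label) <= short_label_max:
        return ".".join(labels)
    return label


if __name__ == "__main__":
    for name in ("qq.com", "xjwqpfkzhtmv.net", "WWW.Example.COM."):
        print(name, "->", detection_input(name))
```

In a real pipeline the threshold would presumably be chosen from the measured relationship between label length and conflict probability that the abstract describes.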
Authors
JIA Dong-sheng (荚东升)
ZHAI Jiang-tao (翟江涛)
ZHOU Qiao (周桥)
SUN Hao-xiang (孙浩翔)
School of Electronic & Information Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, China
Source
Journal of Northeast Normal University (Natural Science Edition)
CAS
PKU Core Journal
2024, No. 3, pp. 53-61 (9 pages)
Funding
National Natural Science Foundation of China (61931004, 62072250)
National Key Research and Development Program of China (2021QY0700)