Abstract
To identify malicious counterfeit URLs effectively, a network model that combines skip-gram character embedding with a CNN of consecutive convolutional layers is proposed to extract features from such URLs and detect them. Each URL is split into five parts according to its structure, and skip-gram is used to encode the characters as dense vectors. A CNN with several consecutive convolutional layers extracts features from each part of the URL independently, and the per-part features are then combined. Several classifiers, including Bayesian and random forest classifiers, are used to evaluate the feature space extracted by the model. Experimental results show that the proposed method detects malicious counterfeit URLs quickly and effectively, with detection accuracy reaching 97%, which is better than the typical eXpose multi-kernel convolution model.
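The first two steps of the abstract (splitting a URL into parts and preparing character-level skip-gram training data) can be illustrated with a minimal sketch. The abstract does not say which five parts the authors use, so the split below into scheme, netloc, path, query, and fragment is an assumption based on Python's standard `urlsplit`. The helper names `split_url` and `skipgram_pairs`, the window size, and the sample URL are likewise hypothetical. The sketch only generates (center, context) character pairs; it does not train an embedding or a CNN.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict[str, str]:
    """Split a URL into five parts.

    Assumption: the five parts are the ones urlsplit returns; the paper's
    own segmentation may differ.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


def skipgram_pairs(text: str, window: int = 2) -> list[tuple[str, str]]:
    """Return (center, context) character pairs within `window` positions."""
    pairs = []
    for i, center in enumerate(text):
        lo = max(0, i - window)
        hi = min(len(text), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a character is not its own context
                pairs.append((center, text[j]))
    return pairs


# Hypothetical sample URL with a look-alike host name.
parts = split_url("http://paypa1.example.com/login/index.php?user=a#top")
host_pairs = skipgram_pairs(parts["netloc"], window=1)
```

Pairs like these are the usual training input for a skip-gram model, which learns a dense vector per character by predicting context characters from the center character. Generating them per URL part keeps each part's character statistics separate, in line with the per-part feature extraction the abstract describes.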
Authors
ZHANG Ting (张婷)
QIAN Li-ping (钱丽萍)
WANG Li-dong (汪立东)
ZHANG Hui (张慧)
(College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China)
Source
Computer Engineering and Design (《计算机工程与设计》)
PKU Core Journal (北大核心)
2020, No. 7, pp. 1821-1828 (8 pages)
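Funding
National Natural Science Foundation of China (61571144)
Beijing University of Civil Engineering and Architecture Doctoral Fund (00331616014)
Beijing University of Civil Engineering and Architecture Graduate Innovation Fund (PG2019069)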
Keywords
malicious counterfeit URLs
convolutional neural network
character embedding
feature extraction
deep learning