Abstract
To reduce missed and false detections of malicious web scripts and to improve the precision and recall of detection, the authors propose a detection method based on machine learning. First, a web crawler collects benign and malicious web scripts from websites to form the sample data set used for iterative training. Second, an N-gram feature model is introduced to extract latent features from the web scripts. Finally, a neural network is trained iteratively on the sample data set; once the training error reaches its minimum, malicious web scripts are fed into the network's input layer and the detection results are obtained. Experimental results show that with N-gram features of N = 4 and 700 sample features, the method's missed-detection rate and false-detection rate are both below 1%, and its recall and precision both exceed 95%.
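The feature-extraction step and the four reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say whether N-grams are taken over characters, bytes, or tokens, how the 700 features are selected, or how the rates are defined. The sketch assumes character 4-grams, keeps the 700 most frequent ones, and uses the common confusion-matrix definitions. All function names (`char_ngrams`, `build_vocabulary`, `vectorize`, `detection_metrics`) are hypothetical.

```python
from collections import Counter


def char_ngrams(script, n=4):
    """Return all overlapping character n-grams of a script (assumed unit: characters)."""
    return [script[i:i + n] for i in range(len(script) - n + 1)]


def build_vocabulary(scripts, n=4, max_features=700):
    """Keep the max_features most frequent n-grams in the corpus (assumed selection rule)."""
    counts = Counter()
    for script in scripts:
        counts.update(char_ngrams(script, n))
    # Sort by descending frequency, then alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:max_features]]


def vectorize(script, vocabulary, n=4):
    """Map a script to a count vector over the vocabulary; this would be the classifier input."""
    counts = Counter(char_ngrams(script, n))
    return [counts[gram] for gram in vocabulary]


def _ratio(num, den):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def detection_metrics(y_true, y_pred):
    """Compute the four rates from labels (1 = malicious, 0 = benign), assumed definitions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "missed_detection_rate": _ratio(fn, tp + fn),
        "false_detection_rate": _ratio(fp, fp + tn),
    }
```

Each script becomes a fixed-length vector (at most 700 entries here) that a neural network classifier could take as input; the classifier itself is omitted because the abstract gives no details of its architecture.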
Authors
YU Fei (余飞)
CHEN Qian (陈乾)
LIU Junyuan (刘峻源)
Affiliations: Guang'an Vocational & Technical College, Guang'an, Sichuan 638000, China; Guang'an Public Security Bureau Cyber Security Defense Detachment, Guang'an, Sichuan 638000, China
Source
Information & Computer (《信息与电脑》)
2022, Issue 2, pp. 64-66 (3 pages)