Abstract
Phishing websites remain one of the hard problems in network security: they are highly concealed, yet the losses they cause are often large. Many researchers have used machine learning algorithms to classify websites as phishing or legitimate. This paper takes three classification algorithms commonly used in phishing detection (KNN, SVM, and naive Bayes) and compares them experimentally on URL features and page content features. The experimental results show that, on both URL features and page content features, the linear SVM classifier achieves higher accuracy and recall than the KNN algorithm and the multinomial naive Bayes algorithm.
Authors
王文腾
王传涛
袭薇
佟晖
WANG Wenteng; WANG Chuantao; XI Wei; TONG Hui (School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044; Beijing Public Security Bureau, Beijing 100740; Shanghai Key Laboratory of Information Security Integrated Management Technology, Shanghai 200240)
Source
《北京建筑大学学报》
2019, No. 1, pp. 76-81 (6 pages)
Journal of Beijing University of Civil Engineering and Architecture
Funding
National Natural Science Foundation of China project (11774380)