In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones...In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones to extract the features of phishing URLs. Then a machine learning algorithm is applied to obtain the URL classification model from the sample data set training. In order to adapt to the change of a phishing URL, the classification model should be constantly updated according to the new samples. So, an incremental learning algorithm based on the feedback of the original sample data set is designed. The experiments verify that the combination of the URL features extracted in this paper and the support vector machine (SVM) classification algorithm can achieve a high phishing detection accuracy, and the incremental learning algorithm is also effective.展开更多
基金The National Basic Research Program of China(973 Program)(No.2010CB328104,2009CB320501)the National Natural Science Foundation of China(No.61272531,61070158,61003257,61060161,61003311,41201486)+4 种基金the National Key Technology R&D Program during the11th Five-Year Plan Period(No.2010BAI88B03)Specialized Research Fund for the Doctoral Program of Higher Education(No.20110092130002)the National Science and Technology Major Project(No.2009ZX03004-004-04)the Foundation of the Key Laboratory of Netw ork and Information Security of Jiangsu Province(No.BM2003201)the Key Laboratory of Computer Netw ork and Information Integration of the Ministry of Education of China(No.93K-9)
文摘In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones to extract the features of phishing URLs. Then a machine learning algorithm is applied to obtain the URL classification model from the sample data set training. In order to adapt to the change of a phishing URL, the classification model should be constantly updated according to the new samples. So, an incremental learning algorithm based on the feedback of the original sample data set is designed. The experiments verify that the combination of the URL features extracted in this paper and the support vector machine (SVM) classification algorithm can achieve a high phishing detection accuracy, and the incremental learning algorithm is also effective.