Abstract
To detect malicious Web pages efficiently, a lightweight analysis method based on basic-word features of JavaScript code is proposed. First, a crawler fetches the full source code of each page, and the JavaScript code is separated from it. Second, all of the JavaScript code is represented using self-defined basic words. Finally, three machine learning algorithms, namely K-nearest neighbor (K-NN), principal component analysis (PCA), and one-class support vector machine (one-class SVM), are applied in an anomaly detection mode to detect malicious Web pages. Experimental results show that the detection time overhead of every algorithm is small. With PCA, the detection system reaches a detection rate of 90% at a false positive rate of 1%, and its average effective detection speed is 250 Web pages per second.
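The pipeline in the abstract (extract JavaScript, map it to basic words, score pages against benign examples) can be illustrated with a minimal sketch of the K-NN variant. Everything specific here is an assumption made for illustration: the `BASIC_WORDS` list is hypothetical (this record does not give the paper's word table), the function names are invented, only inline scripts are handled, and the threshold is picked by hand rather than tuned to a target false positive rate.

```python
import math
import re

# Hypothetical basic-word vocabulary; the paper's own word table is not given here.
BASIC_WORDS = ["eval", "unescape", "document.write", "fromCharCode",
               "iframe", "function", "var"]

# Matches inline <script> elements; external scripts (src=...) are not fetched.
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL)


def extract_javascript(html):
    """Return the concatenated bodies of all inline <script> elements."""
    return "\n".join(SCRIPT_RE.findall(html))


def featurize(js):
    """Map JavaScript text to the relative frequency of each basic word.

    Uses crude substring counting, so "var" also matches inside longer names.
    """
    counts = [js.count(word) for word in BASIC_WORDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def knn_score(x, benign_vectors, k=1):
    """Anomaly score: Euclidean distance to the k-th nearest benign sample."""
    distances = sorted(math.dist(x, b) for b in benign_vectors)
    return distances[k - 1]


def is_anomalous(html, benign_vectors, threshold, k=1):
    """Flag a page whose score exceeds a threshold chosen by the caller."""
    x = featurize(extract_javascript(html))
    return knn_score(x, benign_vectors, k) > threshold


# Toy benign training pages (illustrative only, not data from the paper).
benign_pages = [
    "<html><script>function f(){var a=1;var b=2;}</script></html>",
    "<script>function g(){var x=0;}</script>",
    "<script>var a; var b; function h(){}</script>",
]
benign_vectors = [featurize(extract_javascript(p)) for p in benign_pages]

suspicious_page = ("<script>eval(unescape('%64%6f'));"
                   "document.write(String.fromCharCode(60))</script>")
ordinary_page = "<script>function k(){var z=3;}</script>"

print(is_anomalous(suspicious_page, benign_vectors, threshold=0.3))
print(is_anomalous(ordinary_page, benign_vectors, threshold=0.3))
```

Because only benign pages are used for training, this is anomaly detection rather than two-class classification. In practice the threshold would be chosen from the score distribution of held-out benign pages so that the false positive rate stays at the desired level.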
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition)
EI
CAS
CSCD
PKU Core (北大核心)
2014, No. 11, pp. 34-38 (5 pages)
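Funding
Ministry of Education Innovative Research Team in University Program (IRT201206)
Specialized Research Fund for the Doctoral Program of Higher Education (20120009110007, 20120009120010)
Fundamental Research Funds for the Central Universities (2012JBZ010, 2013JBM025)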
Keywords
anomaly detection
malicious Web pages
principal component analysis
Web security
machine learning