
Research on Web Attack Traffic Detection Based on TF-IDF and Random Forest Algorithm (cited by: 5)
Abstract: With the development of network technology and applications, Web servers have inevitably become a main target of attackers. Traditional Web intrusion detection systems based on regular-expression matching suffer from rule bases that are hard to maintain and bloated signature libraries. Conventional detection models based on machine learning rely on complex, manually engineered feature extraction and still achieve a low recognition rate. To address these problems, this paper proposes a Web attack traffic detection model built on TF-IDF and random forest. The model applies TF-IDF at both the word level and the character level, combines the two resulting term-frequency matrices into feature vectors so that payload features are extracted automatically, and uses a random forest classifier to separate normal traffic from attack traffic. Experimental results show that the method reaches a detection rate of 98.7% on attack traffic, extracts features automatically, and simplifies the detection procedure, which makes it well suited to detecting Web attack traffic.
Authors: Zhu Pengcheng (祝鹏程); Fang Yong (方勇); Huang Cheng (黄诚); Liu Qiang (刘强) (College of Electronics and Information, Sichuan University, Chengdu 610065; College of Cybersecurity, Sichuan University, Chengdu 610207)
Source: Journal of Information Security Research (《信息安全研究》), 2018, No. 11, pp. 1040-1045 (6 pages)
Keywords: TF-IDF; random forest; data normalization; feature extraction; Web attack traffic detection
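The pipeline summarized in the abstract (word-level and character-level TF-IDF features, concatenated and fed to a random forest) can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' implementation: the toy payloads, the helper names (`Tfidf`, `featurize`, `fit_forest`, and so on), the smoothed IDF formula, the character bigram size, and the forest size are all assumptions made for the example, since the record above does not state the paper's dataset or parameters.

```python
import math
import random
import re
from collections import Counter

def word_tokens(payload):
    """Word-level tokens: runs of letters, digits, or underscores."""
    return re.findall(r"\w+", payload.lower())

def char_tokens(payload, n=2):
    """Character-level tokens: overlapping n-grams (n=2 is an assumption)."""
    text = payload.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

class Tfidf:
    """Minimal TF-IDF with smoothed IDF and L2-normalised rows."""
    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def fit(self, docs):
        df = Counter()
        for doc in docs:
            df.update(set(self.tokenizer(doc)))  # document frequency
        n = len(docs)
        self.vocab = sorted(df)
        self.idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in self.vocab}
        return self

    def transform(self, docs):
        rows = []
        for doc in docs:
            tf = Counter(self.tokenizer(doc))  # unseen tokens are ignored
            row = [tf[t] * self.idf[t] for t in self.vocab]
            norm = math.sqrt(sum(v * v for v in row)) or 1.0
            rows.append([v / norm for v in row])
        return rows

def featurize(docs, word_model, char_model):
    """Concatenate the word-level and character-level TF-IDF vectors."""
    return [w + c for w, c in zip(word_model.transform(docs), char_model.transform(docs))]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(X, y, rng, n_try):
    """Grow an unpruned tree; each split looks at n_try randomly chosen features."""
    if len(set(y)) == 1:
        return y[0]  # pure node becomes a leaf
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        for thr in set(row[f] for row in X):
            left = [i for i, row in enumerate(X) if row[f] <= thr]
            right = [i for i, row in enumerate(X) if row[f] > thr]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    if best is None:  # sampled features cannot split this node: majority leaf
        return Counter(y).most_common(1)[0][0]
    _, f, thr, left, right = best
    return (f, thr,
            grow_tree([X[i] for i in left], [y[i] for i in left], rng, n_try),
            grow_tree([X[i] for i in right], [y[i] for i in right], rng, n_try))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(math.sqrt(len(X[0]))))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_try))
    return forest

def predict_forest(forest, X):
    """Majority vote over all trees for each row."""
    return [Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0] for row in X]

# Made-up payloads for illustration only: label 0 = normal, 1 = attack.
payloads = ["/index.php?id=12", "/search?q=laptop+bag",
            "/user/profile?name=alice", "/news/list?page=3",
            "/index.php?id=1' OR '1'='1", "/search?q=<script>alert(1)</script>",
            "/view?file=../../etc/passwd", "/item?id=1 UNION SELECT password FROM users"]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

word_model = Tfidf(word_tokens).fit(payloads)
char_model = Tfidf(char_tokens).fit(payloads)
X = featurize(payloads, word_model, char_model)
forest = fit_forest(X, labels)
new_request = ["/search?q=<script>alert(document.cookie)</script>"]
predictions = predict_forest(forest, featurize(new_request, word_model, char_model))
print(predictions)
```

Fitting two separate TF-IDF models and concatenating their rows keeps the word-level and character-level vocabularies independent, which is one plausible reading of "combining the word frequency matrices obtained by the two training methods". With only eight toy samples the prediction carries no statistical weight; the 98.7% detection rate reported in the abstract refers to the authors' own experiments, not to this sketch.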
