Malicious Web Page Recognition Based on Undersampling and Multi-Layer Ensemble Learning
Abstract In practice, the ratio of malicious to benign web pages is severely imbalanced, so traditional machine learning classification models cannot be applied well. To address this problem, a malicious web page detection model based on undersampling and multi-layer ensemble learning is proposed. Undersampling achieves local data balance; a first layer of ensemble learning based on weights and thresholds ensures the accuracy of the model; and a second layer of voting-based ensemble learning preserves the integrity of global information. Experimental results show that the proposed model outperforms traditional machine learning models at identifying malicious web pages on imbalanced datasets.
Authors WANG Fa-yu (王法玉); YU Xiao-wen (于晓文); CHEN Hong-tao (陈洪涛) (National Engineering Laboratory of Computer Virus Prevention and Control Technology, Tianjin University of Technology, Tianjin 300384, China)
Source Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2024, No. 3, pp. 669-675 (7 pages)
Funding National Natural Science Foundation of China (61571328).
Keywords malicious web page identification; imbalanced data; multi-layer classifier; undersampling; machine learning; ensemble learning; detection performance
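
The abstract names three stages (undersampling, a weight-and-threshold ensemble, and a voting ensemble) without giving details, so the sketch below is only one plausible reading of that pipeline, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the use of disjoint benign chunks paired with all malicious samples, the one-feature decision stumps as base learners, training accuracy as the weight, and the 0.5 threshold.

```python
import random


def make_balanced_subsets(benign, malicious, n_subsets, seed=0):
    """Undersampling (assumed scheme): pair each disjoint chunk of benign
    samples with all malicious samples so every subset is balanced."""
    rng = random.Random(seed)
    pool = list(benign)
    rng.shuffle(pool)
    k = len(malicious)
    subsets = []
    for i in range(n_subsets):
        chunk = pool[i * k:(i + 1) * k]
        if len(chunk) < k:  # not enough benign samples left for a full chunk
            break
        subsets.append([(x, 0) for x in chunk] + [(x, 1) for x in malicious])
    return subsets


def train_stump(data, feature):
    """Hypothetical base learner: split one feature at the midpoint of the
    two class means. Returns (predict_fn, training_accuracy)."""
    pos = [x[feature] for x, y in data if y == 1]
    neg = [x[feature] for x, y in data if y == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (mu_pos + mu_neg) / 2
    sign = 1 if mu_pos >= mu_neg else -1

    def predict(x):
        return 1 if sign * (x[feature] - cut) > 0 else 0

    accuracy = sum(predict(x) == y for x, y in data) / len(data)
    return predict, accuracy


def train_layer_one(data, n_features, threshold=0.5):
    """First layer (assumed form): weight each base learner by its training
    accuracy and flag a page as malicious when the weighted score reaches
    the threshold."""
    stumps = [train_stump(data, f) for f in range(n_features)]
    total = sum(w for _, w in stumps)

    def predict(x):
        score = sum(w * p(x) for p, w in stumps) / total
        return 1 if score >= threshold else 0

    return predict


def train_model(benign, malicious, n_subsets, n_features):
    """Second layer: majority vote over the per-subset first-layer models."""
    subsets = make_balanced_subsets(benign, malicious, n_subsets)
    members = [train_layer_one(s, n_features) for s in subsets]

    def predict(x):
        votes = sum(m(x) for m in members)
        return 1 if votes * 2 > len(members) else 0

    return predict
```

Here `benign` and `malicious` are lists of numeric feature tuples; `train_model` returns a function that maps one feature tuple to 1 (malicious) or 0 (benign). Because each first-layer model sees only one slice of the benign data, the final vote is what lets the combined model draw on the whole benign set.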