
Parallel implementation method of firewall log anomaly detection based on improved random forest
Cited by: 1
Abstract: In a random forest classifier, the individual decision trees produced during training and used in voting differ in classification accuracy, so a small number of weak trees can degrade the overall performance of the forest. Class imbalance in the dataset also lowers the accuracy of the individual trees. To address these two shortcomings, constraints are added to the Bootstrap sampling method to reduce the effect of imbalanced data on tree generation, and the generated trees are evaluated and weighted using out-of-bag (OOB) data and an imbalance coefficient. Experimental results show that the proposed algorithm improves the classification accuracy of random forest on imbalanced data.
Authors: Liu Cheng; Wang Jiabin; Hong Jiwei (College of Engineering, Huaqiao University, Quanzhou 362021, China)
Affiliation: College of Engineering, Huaqiao University
Source: Modern Computer (《现代计算机》), 2023, No. 14, pp. 66-69 (4 pages)
Keywords: Spark; random forest; intrusion detection; log anomaly detection
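
The two ideas named in the abstract, a constrained Bootstrap draw and out-of-bag weighting of each tree's vote, can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the actual constraint or the imbalance coefficient, so the code assumes (a) a draw is accepted only if the rarest class appears at least as often as in the full data, and (b) a tree's weight is its balanced accuracy (mean per-class recall) on its out-of-bag rows. A single-split stump stands in for a full decision tree, and the loop is serial rather than distributed with Spark. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def constrained_bootstrap(y, rng, max_tries=50):
    """Bootstrap indices, redrawn until the rarest class is at least as
    frequent in the sample as in the full data (an assumed constraint)."""
    n = len(y)
    rare, rare_count = min(Counter(y).items(), key=lambda kv: kv[1])
    for _ in range(max_tries):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(1 for i in idx if y[i] == rare) >= rare_count:
            break
    return idx  # the last draw is used if no draw met the constraint


def fit_stump(X, y, idx):
    """Stand-in base learner: the single (feature, threshold) split with the
    fewest training errors on the bootstrap sample."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({X[i][f] for i in idx}):
            left = Counter(y[i] for i in idx if X[i][f] <= t)
            right = Counter(y[i] for i in idx if X[i][f] > t)
            l_lab = left.most_common(1)[0][0]
            r_lab = right.most_common(1)[0][0] if right else l_lab
            correct = left[l_lab] + right[r_lab]
            if best is None or correct > best[0]:
                best = (correct, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab


def oob_weight(stump, X, y, idx):
    """Balanced accuracy on out-of-bag rows (assumed stand-in for the
    paper's OOB evaluation combined with an imbalance coefficient)."""
    in_bag = set(idx)
    hits, totals = Counter(), Counter()
    for i in range(len(y)):
        if i not in in_bag:
            totals[y[i]] += 1
            hits[y[i]] += predict_stump(stump, X[i]) == y[i]
    if not totals:
        return 0.0
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees weighted stumps; each iteration is independent of the
    others, so the loop could be distributed across workers."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = constrained_bootstrap(y, rng)
        stump = fit_stump(X, y, idx)
        forest.append((stump, oob_weight(stump, X, y, idx)))
    return forest


def predict_forest(forest, x):
    """Weighted vote: each stump adds its OOB weight to the label it predicts."""
    votes = defaultdict(float)
    for stump, w in forest:
        votes[predict_stump(stump, x)] += w
    return max(votes, key=votes.get)
```

Calling `fit_forest(X, y)` on a list of feature rows `X` and labels `y`, then `predict_forest(forest, x)` on a new row, returns the label with the largest total weight. Trees that perform poorly on their out-of-bag rows, especially on the minority class, contribute less to the vote.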

