Abstract
With the advent of the big data era, the total volume of attack traffic on networks has risen sharply, and identifying malicious traffic through abnormal traffic detection has become an urgent problem. Abnormal traffic detection equipment currently used in industry relies mainly on statistical analysis or simple machine learning methods, and it suffers from very large volumes of traffic data, a large share of redundant normal data, low precision, and a high false alarm rate. To address these problems, this paper proposes a traffic anomaly detection method based on hierarchical clustering that operates at the data processing stage. The method first applies a hierarchical clustering algorithm to reduce the data, and then builds hierarchical-clustering-based abnormal traffic models using seven different machine learning algorithms. Experimental results show that, on the DARPA dataset, the method detects abnormal behavior with a precision of up to 99% and a recall of up to 99%. After data reduction, it still maintains a precision above 90%, which greatly improves detection efficiency.
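The data-reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the linkage criterion, the features, or the exact reduction rule, so the code below makes its own assumptions: centroid linkage, and replacement of each cluster of normal records by its centroid. The function names (`centroid`, `agglomerative_clusters`, `reduce_normal_traffic`) and the toy flow records are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def agglomerative_clusters(points, n_clusters):
    """Naive bottom-up clustering with centroid linkage (cubic time, demo only)."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        # Pick the two clusters whose centroids are closest and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]  # safe: combinations guarantees i < j
    return clusters


def reduce_normal_traffic(normal, n_clusters):
    """Assumed reduction rule: keep one centroid per cluster of normal records."""
    return [centroid(c) for c in agglomerative_clusters(normal, n_clusters)]


# Hypothetical toy flows: (duration, bytes), already scaled to [0, 1].
normal = [(0.10, 0.10), (0.11, 0.09), (0.12, 0.11),
          (0.80, 0.20), (0.82, 0.21), (0.79, 0.19)]
attack = [(0.50, 0.95), (0.52, 0.97)]

reduced = reduce_normal_traffic(normal, n_clusters=2)
# Label 0 = normal, 1 = attack; this reduced set would feed a downstream classifier.
training_set = [(x, 0) for x in reduced] + [(x, 1) for x in attack]
print(len(normal), "->", len(reduced), "normal records;", len(training_set), "training rows")
```

Only the redundant normal records are compressed in this sketch; the attack records are kept in full, so the downstream classifier sees a smaller and less imbalanced training set.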
Authors
蹇诗婕
卢志刚
姜波
刘玉岭
刘宝旭
Jian Shijie; Lu Zhigang; Jiang Bo; Liu Yuling; Liu Baoxu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049)
Source
《信息安全研究》
2020, Issue 6, pp. 474-481 (8 pages)
Journal of Information Security Research
Funding
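National Key Research and Development Program of China (2019QY1303, 2019QY1302, 2018YFB0803602)
Strategic Priority Research Program of the Chinese Academy of Sciences, Category C (XDC02040100).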
Keywords
flow anomaly detection
data preprocessing
data reduction
hierarchical clustering
machine learning methods