Abstract
Intrusion detection data sets are high-dimensional, which slows down detection algorithms, yet many of their features contribute little to detection performance. To address this problem, this paper proposes a stepwise feature selection algorithm. Based on definitions of relevant and redundant features, and using mutual information as the criterion, the algorithm first removes irrelevant features and then removes redundant ones. It has low time complexity, is independent of the detection algorithm, and allows the trade-off between detection accuracy and the number of features to be tuned through a threshold. Experiments on network connection records from the KDD-99 data set were carried out with several detection algorithms. The results show that the algorithm selects features effectively, maintaining detection accuracy while improving detection speed.
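The two-step procedure summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact relevance and redundancy definitions, so everything specific here is an assumption: features are taken to be discrete, a feature counts as irrelevant when its mutual information with the class label falls below a hypothetical `relevance_threshold`, and a feature counts as redundant when an already selected feature shares at least as much information with it as it shares with the label (an approximate Markov blanket test in the style of FCBF). The function names, the threshold value, and the toy data are illustrative only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def stepwise_select(columns, labels, relevance_threshold=0.05):
    """Select feature names from `columns` (dict: name -> list of discrete values).

    Assumed criteria, not taken from the paper:
      step 1 drops features whose I(feature; label) is below the threshold;
      step 2 drops a feature if some already kept feature shares at least as
      much information with it as it shares with the label.
    """
    # Step 1: remove irrelevant features.
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    candidates = sorted(
        (name for name, r in relevance.items() if r >= relevance_threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Step 2: remove redundant features, visiting the most relevant first.
    selected = []
    for name in candidates:
        redundant = any(
            mutual_information(columns[kept], columns[name]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data (invented for illustration, not drawn from KDD-99).
labels = [0, 0, 1, 1, 2, 2, 3, 3]
columns = {
    "f_high": [0, 0, 0, 0, 1, 1, 1, 1],   # informative
    "f_copy": [0, 0, 0, 0, 1, 1, 1, 1],   # duplicate of f_high, hence redundant
    "f_low": [0, 0, 1, 1, 0, 0, 1, 1],    # informative and independent of f_high
    "f_noise": [0, 1, 0, 1, 0, 1, 0, 1],  # carries no information about the label
}
print(stepwise_select(columns, labels))  # ['f_high', 'f_low']
```

Raising `relevance_threshold` discards more features in the first step, which is one way a threshold could trade detection accuracy against the number of features. Because the selection uses only the data and the labels, it stays independent of whichever detection algorithm is applied afterwards.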
Source
《计算机工程与应用》
CSCD
PKU Core (Peking University Core Journals)
2010, Issue 11, pp. 81-84, 87 (5 pages in total)
Computer Engineering and Applications
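Funding
Shanghai Special Research Fund for the Selection and Training of Outstanding Young University Teachers, No. YYY-07008
Shanghai Institute of Technology Research Start-up Project for Introduced Talent, No. YJ2007-24
Key Discipline of Computer Science and Technology, Shanghai Institute of Technology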
Keywords
intrusion detection
feature selection
mutual information
Markov blanket