Abstract
Existing dictionary learning methods cannot effectively construct structured dictionaries with discriminative power, and they ignore the unequal misclassification costs that arise when positive and negative samples are imbalanced. To address both problems, this paper proposes a cost-sensitive discriminative dictionary learning method (CS-DDL) and applies it to network intrusion detection. First, the sparse representation model is reformulated: a constrained discriminative term is added to the objective function so that the learned dictionary is discriminative. Second, because intrusion and non-intrusion records in the dataset are imbalanced and their detection costs differ, a cost-sensitive matrix is introduced to account for the effect of different misdetection costs on classification performance. Experiments use the preprocessed KDD99 network intrusion dataset, and classifier performance is evaluated with recall, precision, false acceptance rate, and F-measure. Compared with machine learning algorithms such as support vector machines (SVM), decision trees, and cluster analysis, CS-DDL yields a clear improvement in classifier performance.
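The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The function names, the label convention (1 = intrusion, 0 = normal), and the cost values `cost_fn` and `cost_fp` are assumptions made for illustration only. The abstract does not define the false acceptance rate, so it is assumed here to be the share of intrusion records accepted as normal; the paper may define it differently.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # intrusions flagged as intrusions
    fp: int  # normal records flagged as intrusions
    fn: int  # intrusions accepted as normal
    tn: int  # normal records accepted as normal


def confusion_counts(y_true, y_pred):
    """Count outcomes, treating label 1 as intrusion and 0 as normal (assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return Counts(tp, fp, fn, tn)


def evaluate(c, cost_fn=5.0, cost_fp=1.0):
    """Return precision, recall, F-measure, false acceptance rate, and total cost.

    cost_fn and cost_fp are hypothetical costs for a missed intrusion and a
    false alarm; the values used in the paper are not given in the abstract.
    """
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Assumed definition: fraction of intrusions that were accepted as normal.
    false_accept_rate = c.fn / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # Cost-sensitive view: each error type is weighted by its own cost.
    total_cost = cost_fn * c.fn + cost_fp * c.fp
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_accept_rate": false_accept_rate,
        "total_cost": total_cost,
    }
```

For example, `evaluate(confusion_counts([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))` counts one missed intrusion and one false alarm. With the assumed costs, the missed intrusion contributes five times as much to `total_cost` as the false alarm, which is the kind of asymmetry a cost-sensitive matrix is meant to capture.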
Source
Bulletin of Science and Technology (《科技通报》)
Peking University (PKU) Core Journal
2017, Issue 12, pp. 162-166 (5 pages)
Funding
Fundamental Research Funds for the Central Universities (LGYB201605)
Keywords
intrusion detection
cost-sensitive
dictionary learning
classification performance
machine learning