Abstract
To improve the accuracy and generalization ability of intrusion detection models, the diversity of the ensemble learning system needs to be improved. This paper therefore combines sample perturbation with feature perturbation. The large-scale data set is partitioned to construct different sample subsets, which increases the differences among the individual learners in the ensemble. In the feature perturbation stage, principal component analysis is used to reduce information redundancy, feature weights and the information gain ratio are set according to information gain, and an adaptive random search is performed over feature subsets to improve the diversity of the ensemble system. Simulation experiments were carried out on the KDD Cup99 data set. The results show that the proposed intrusion detection model can learn effectively from very large-scale data and achieves relatively high detection accuracy for the various types of attack behavior.
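The two perturbation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the choice to sample features with probability proportional to information gain are all assumptions made for illustration, and the principal component analysis step and the adaptive part of the search are omitted.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column w.r.t. the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def split_samples(n_rows, n_parts, rng):
    """Sample perturbation (assumed form): shuffle the row indices and
    deal them into n_parts disjoint subsets, one per ensemble member."""
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return [indices[i::n_parts] for i in range(n_parts)]


def weighted_feature_subset(weights, k, rng):
    """Feature perturbation (assumed form): draw k distinct feature
    indices, each draw weighted by the feature's information gain."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        # A small constant keeps zero-gain features selectable.
        w = [weights[j] + 1e-9 for j in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


# Hypothetical toy data: two discrete features and a binary label.
columns = [["a", "a", "b", "b"], ["x", "y", "x", "y"]]
labels = ["normal", "normal", "attack", "attack"]

rng = random.Random(0)
gains = [information_gain(col, labels) for col in columns]
sample_subsets = split_samples(len(labels), 2, rng)
feature_subsets = [weighted_feature_subset(gains, 1, rng)
                   for _ in sample_subsets]
```

Each pair of a sample subset and a feature subset would then be used to train one base learner, so that ensemble members differ both in the rows and in the columns they see.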
Authors
张康宁
廖光忠
ZHANG Kang-ning; LIAO Guang-zhong (Institute of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China; Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial Systems, Wuhan University of Science and Technology, Wuhan 430081, China)
Source
《东北师大学报(自然科学版)》
CAS
Peking University Core Journal (北大核心)
2020, Issue 4, pp. 53-59 (7 pages)
Journal of Northeast Normal University (Natural Science Edition)
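Funding
National Natural Science Foundation of China (61502359)
Development Project of the Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial Systems (2016znss04A)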
Keywords
intrusion detection
ensemble learning
information gain
diversity