Abstract
The traditional AdaBoost algorithm suffers from two problems: a flaw in its sample weight update method lowers classification accuracy, and redundant weak classifiers slow classification and raise computational cost. To address both, we propose an AdaBoost algorithm based on an improved weight update method and selective ensemble. First, in the weak classifier training stage, an improved AdaBoost algorithm is proposed that updates the weight of each sample according to its average accuracy over the previous t training rounds. This makes the weight updates across all samples more balanced and, to some extent, suppresses the unbounded growth of the weights of noisy samples. Second, in the weak classifier combination stage, a novel similarity measure between weak classifiers is proposed. Selective ensemble is then performed using this measure together with a hierarchical clustering algorithm, which removes redundant weak classifiers, increases classification speed, and reduces computational overhead. Finally, the proposed scheme is simulated and verified on three data sets: KDDCUP99, waveform, and image-segmentation. The classification accuracies on these data sets are 99.51%, 86.07%, and 94.45%, respectively. The experimental results show that applying the AdaBoost algorithm with improved weight update and selective ensemble to intrusion detection systems not only improves classification accuracy and detection speed but also reduces computational cost.
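The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulas, so everything below is an assumption made for illustration only: the weight rule `exp(alpha * (1 - 2 * avg_acc))`, the use of prediction agreement as the similarity between weak classifiers, average-linkage clustering, and keeping the highest-`alpha` classifier in each cluster are all hypothetical stand-ins, as are the function names.

```python
import math


def update_weights(correct_history, alpha):
    """Return normalized sample weights for the next round.

    correct_history[i] holds 0/1 flags saying whether sample i was
    classified correctly in each of the previous t rounds.
    ASSUMPTION: the rule below is a hypothetical stand-in, not the
    paper's formula. Each weight depends only on the sample's average
    accuracy, so the ratio between any two weights is at most
    exp(2 * alpha) and cannot grow without bound.
    """
    raw = []
    for flags in correct_history:
        avg_acc = sum(flags) / len(flags)
        raw.append(math.exp(alpha * (1.0 - 2.0 * avg_acc)))
    total = sum(raw)
    return [r / total for r in raw]


def agreement(preds_a, preds_b):
    """ASSUMED similarity: fraction of samples given the same label."""
    same = sum(1 for a, b in zip(preds_a, preds_b) if a == b)
    return same / len(preds_a)


def select_representatives(predictions, alphas, n_clusters):
    """Cluster weak classifiers and keep one representative per cluster.

    predictions[k] is the label list produced by weak classifier k.
    Uses simple average-linkage agglomerative clustering and keeps the
    classifier with the largest alpha in each cluster (both assumptions).
    """
    n_clusters = max(1, n_clusters)
    clusters = [[k] for k in range(len(predictions))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(agreement(predictions[i], predictions[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = linkage(clusters[a], clusters[b])
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the most similar pair
        del clusters[b]

    return sorted(max(c, key=lambda k: alphas[k]) for c in clusters)


if __name__ == "__main__":
    history = [[1, 1, 1], [1, 0, 1], [0, 0, 0]]
    print(update_weights(history, alpha=1.0))

    preds = [[1, 1, -1, -1], [1, 1, -1, -1], [-1, 1, 1, -1], [-1, 1, 1, 1]]
    print(select_representatives(preds, [0.5, 0.9, 0.4, 0.7], n_clusters=2))
```

In this toy run the two identical classifiers fall into one cluster and only the one with the larger `alpha` is kept, which is the sense in which selective ensemble removes redundant weak classifiers. A sample that is misclassified in every round receives the largest weight, but that weight stays bounded relative to the others.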
Authors
欧阳潇琴
王秋华
OUYANG Xiao-qin; WANG Qiu-hua (School of Communication Engineering, Hangzhou Dianzi University; School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018, China)
Source
《软件导刊》
2020, No. 4, pp. 257-262 (6 pages)
Software Guide
Funding
Natural Science Foundation of Zhejiang Province (LY19F020039)
Major Scientific Research Project of Zhejiang Lab (2019DH0ZX01)