Abstract
To address the poor performance of single classification algorithms on minority (small-sample) network flows when training data are insufficient, the Adaptive Boosting (AdaBoost) algorithm is applied to traffic classification. First, the Correlation-based Feature Selection (CFS) method extracts a small set of effective classification features from a large number of flow features. On this basis, AdaBoost combines five single classification algorithms, drawn from the decision tree, association rule, and Bayesian families, to classify traffic. Experiments on real network traffic show that the AdaBoost-based ensemble achieves the highest accuracy among the selected algorithms, reaching 98.92%, and that it clearly improves the classification of minority flows compared with the single classification algorithms.
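The boosting step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes binary labels in {+1, -1}, uses one-feature decision stumps as a stand-in for the five base classifiers, and takes feature selection (CFS in the paper) as already done. All function names here are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Find the stump with the lowest weighted error.

    Returns (error, feature_index, threshold, polarity). A stump predicts
    `polarity` when row[feature_index] >= threshold, else -polarity.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost for labels in {+1, -1} (illustrative assumption)."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero below
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
```

The reweighting step is what helps under-represented samples: instances that earlier learners misclassify gain weight, so later learners concentrate on them. A multi-class traffic classifier would need a multi-class boosting variant and real flow features, which this sketch does not attempt.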
Source
Telecommunication Engineering (《电讯技术》)
Peking University Core Journal (北大核心)
2013, No. 9, pp. 1207-1212 (6 pages)
Funding
Key Project of the Natural Science Basic Research Plan of Shaanxi Province (2012JZ8005)
Keywords
network traffic
traffic classification
correlation-based feature selection
adaptive boosting algorithm
ensemble classifier