Abstract
To address class imbalance in network traffic classification, a traffic classification algorithm based on K-means and k-nearest neighbor (KMkNN) is proposed, and an ensemble classifier based on KMkNN (KKEC) is designed on top of it. Different subsets of the input features are first extracted to train different classifiers; the ensemble output is then produced by a voting scheme that combines absolute majority with relative majority; finally, the method is tested on imbalanced datasets. Theoretical analysis and experimental results both show that the algorithm improves the recognition rate of minority-class flows when protocol flows are imbalanced.
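The ensemble step described in the abstract (train on different feature subsets, then vote) might look roughly like the sketch below. The abstract does not say how the feature subsets are chosen or exactly how absolute and relative majority are combined, so everything here is an assumption for illustration: `draw_feature_subsets` and `combine_votes` are hypothetical names, subsets are drawn uniformly at random, and the vote first checks for an absolute majority (more than half of the votes) and otherwise falls back to a relative majority (plurality). The KMkNN base classifier itself is not reproduced.

```python
import random
from collections import Counter


def draw_feature_subsets(n_features, n_subsets, subset_size, seed=0):
    """Draw one feature-index subset per base classifier.

    Assumption: uniform random sampling without replacement; the paper's
    actual subset selection is not given in the abstract.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_subsets)]


def combine_votes(votes):
    """Combine base-classifier predictions for a single flow.

    Returns (label, rule), where rule is "absolute" if the winning label
    holds more than half of all votes and "relative" if it only holds the
    most votes. Ties are broken in favour of the label seen first.
    Assumption: this fallback ordering is one plausible reading of the
    abstract, not the authors' confirmed rule.
    """
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    if 2 * top > len(votes):
        return label, "absolute"
    return label, "relative"


if __name__ == "__main__":
    subsets = draw_feature_subsets(n_features=10, n_subsets=3, subset_size=4)
    print(subsets)
    print(combine_votes(["http", "http", "p2p"]))
    print(combine_votes(["http", "p2p", "dns", "p2p"]))
```

Under this assumed fallback the winning label always equals the plurality label, so the sketch also returns which rule decided the vote; a scheme that treats the two cases differently (for example, rejecting or re-weighting flows without an absolute majority) would branch on that second value.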
Source
Journal of Information Engineering University (《信息工程大学学报》)
2015, No. 2, pp. 240-244 (5 pages)
Funding
National Science and Technology Major Project of China (2010ZX0300602-001)
Keywords
ensemble learning
traffic classification
imbalance