Abstract
Classifying network traffic by its statistical features requires selecting an optimal feature subset from a large number of candidates, because redundant and irrelevant features increase model complexity and reduce classification accuracy and efficiency. To address this problem, a network traffic feature selection method based on statistics and ranking is proposed. First, an initial feature subset is generated using a feature selection coefficient defined by statistical methods. Then, a feature influence coefficient built on classification accuracy serves as the basis for evaluating and ranking features, and a second round of feature selection over the initial subset produces the optimal feature subset. Experimental results show that the method effectively reduces the number of traffic statistical features while maintaining overall classification accuracy, achieving a good balance among classification effectiveness, efficiency, and stability.
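The two-stage procedure described in the abstract can be sketched roughly as follows. The abstract does not give the formulas for the feature selection coefficient or the feature influence coefficient, so this sketch substitutes stand-ins that are assumptions, not the authors' definitions: a between-class/within-class scatter ratio for stage 1, and the accuracy drop of a nearest-centroid classifier when a feature is removed for stage 2. All function names, the thresholds, and the `web`/`p2p` labels are hypothetical.

```python
import random
from statistics import mean, pvariance


def separation_score(column, labels):
    """Stand-in stage-1 statistic (assumption): between-class / within-class scatter."""
    overall = mean(column)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(label, []).append(value)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    within = sum(len(g) * pvariance(g) for g in groups.values())
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def accuracy(train, valid, features):
    """Held-out accuracy of a nearest-centroid classifier using only `features`."""
    if not features:
        return 0.0
    totals = {}
    for row, label in train:
        sums, count = totals.get(label, ([0.0] * len(features), 0))
        totals[label] = ([s + row[f] for s, f in zip(sums, features)], count + 1)
    centroids = {lab: [s / n for s in sums] for lab, (sums, n) in totals.items()}

    def distance(row, centre):
        return sum((row[f] - c) ** 2 for f, c in zip(features, centre))

    hits = sum(
        min(centroids, key=lambda lab: distance(row, centroids[lab])) == label
        for row, label in valid
    )
    return hits / len(valid)


def select_features(train, valid, threshold=1.0, tolerance=0.0):
    """Two-stage selection: statistical filter, then accuracy-based ranking."""
    labels = [label for _, label in train]
    n_features = len(train[0][0])
    # Stage 1: keep features whose statistical score reaches the threshold.
    initial = [
        f for f in range(n_features)
        if separation_score([row[f] for row, _ in train], labels) >= threshold
    ]
    # Stage 2: influence of a feature = accuracy lost when it is left out.
    baseline = accuracy(train, valid, initial)
    influence = {
        f: baseline - accuracy(train, valid, [g for g in initial if g != f])
        for f in initial
    }
    selected = list(initial)
    # Try dropping the least influential features first.
    for f in sorted(initial, key=lambda f: influence[f]):
        candidate = [g for g in selected if g != f]
        if candidate and accuracy(train, valid, candidate) >= baseline - tolerance:
            selected = candidate
    return selected


def make_demo_data(seed, per_class=100):
    """Synthetic flows: feature 0 informative, 1 redundant (2 * feature 0), 2 noise."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("web", 0.0), ("p2p", 5.0)):
        for _ in range(per_class):
            x = rng.gauss(centre, 1.0)
            data.append(([x, 2.0 * x, rng.gauss(0.0, 1.0)], label))
    return data
```

Calling `select_features(make_demo_data(1), make_demo_data(2))` on this synthetic data is meant to show the intent of each stage: the statistical filter discards the noise feature, and the accuracy-based pass then discards one of the two redundant features. A real evaluation would use labeled traffic traces and the coefficients defined in the paper itself.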
Source
Application of Electronic Technique (《电子技术应用》)
2018, No. 1, pp. 84-87 (4 pages)
Funding
National Computer Network and Information Security Technology Research Program (242 Research Program) (2016QN027)
Keywords
network traffic classification
feature selection
statistic and ranking
feature influence