Abstract
Network traffic classification (NTC) is a prerequisite and foundation for network monitoring, quality-of-service (QoS) management, and network security. To classify massive volumes of network traffic quickly and accurately, the random forest algorithm is improved using similarity and weights: the similarity between decision trees is computed to eliminate redundant decisions and strengthen classification performance, and classification performance is then used as the index for setting the weights with which the random forest is built. A parallel version of the algorithm is designed and implemented on the Spark platform to improve classification efficiency. Experimental results show that the method improves network traffic classification performance, is scalable and robust, and can handle massive traffic classification tasks.
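The two steps named in the abstract, pruning similar trees and weighting the survivors by classification performance, can be illustrated with a minimal sketch. The abstract does not state how similarity or weights are computed, so everything here is an assumption: similarity is taken as the fraction of validation samples on which two trees agree, the weight is each tree's validation accuracy, and the names `select_trees`, `weighted_vote`, and `max_similarity` are hypothetical. The Spark parallelization is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of validation samples a tree labels correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Assumed similarity measure: fraction of samples where two trees agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_trees(
    tree_preds: List[Sequence[int]],
    truth: Sequence[int],
    max_similarity: float = 0.9,  # hypothetical threshold, not from the paper
) -> Dict[int, float]:
    """Greedily keep the most accurate trees, skipping near-duplicates.

    Returns {tree index: weight}, where the weight is the tree's
    validation accuracy (an assumed weighting scheme).
    """
    scores = [accuracy(p, truth) for p in tree_preds]
    # Visit trees from most to least accurate; ties keep their original order.
    order = sorted(range(len(tree_preds)), key=lambda i: scores[i], reverse=True)
    kept: Dict[int, float] = {}
    for i in order:
        if all(agreement(tree_preds[i], tree_preds[j]) < max_similarity for j in kept):
            kept[i] = scores[i]
    return kept


def weighted_vote(sample_preds: Sequence[int], weights: Dict[int, float]) -> int:
    """Return the label with the largest total weight among the kept trees.

    sample_preds[i] is tree i's predicted label for one sample.
    """
    totals: Dict[int, float] = defaultdict(float)
    for i, w in weights.items():
        totals[sample_preds[i]] += w
    return max(totals, key=totals.get)


# Toy validation data: four trees, five samples.
truth = [0, 1, 1, 0, 1]
tree_preds = [
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],  # identical to tree 0, so it is pruned
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
]
kept = select_trees(tree_preds, truth)
```

In this toy run the duplicate tree is dropped and the remaining three vote with weights equal to their validation accuracy, so two weaker trees that agree can still outvote the single strongest one.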
Authors
刘兆禄
赵英
刘淑梅
LIU Zhaolu; ZHAO Ying; LIU Shumei (College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China)
Source
《通信学报》
EI
CSCD
Peking University Core Journal
2018, Issue A01, pp. 30-36 (7 pages)
Journal on Communications