Abstract
With the continuing development of Internet technology and the growth of network scale, new network services keep emerging. To guarantee users' quality of service, classifying service traffic accurately and quickly has become a current research focus. Traditional service identification methods mostly classify by protocol or by specific service, which limits their applicability. Combining service traffic characteristics with machine learning, this paper proposes a service traffic identification method that fuses a Generative Adversarial Network (GAN) with Extreme Gradient Boosting (XGBoost). The method first extracts traffic features that represent the resource requirements of services. It then uses an improved GAN algorithm to augment minority-class samples, addressing the low model accuracy caused by imbalanced dataset distribution during service identification. Finally, it performs feature selection with the random forest algorithm and trains the model with XGBoost. The results show that the method achieves a service identification accuracy of 97.32%.
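The minority-class augmentation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the improved GAN, so the generator here is a plain Gaussian-jitter placeholder, and the names `balance_classes`, `jitter_generator`, and the class labels in the usage note are hypothetical. The sketch only shows where a trained generator would plug in to bring every class up to the majority-class count before feature selection and XGBoost training.

```python
import random
from collections import Counter


def balance_classes(samples, labels, generate, seed=0):
    """Oversample every minority class up to the majority-class count.

    `generate(real_rows, rng)` is a hypothetical hook that returns one
    synthetic feature vector for a class. In the paper this role is played
    by the improved GAN; any callable with this shape works here.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Real rows of this class, used as the basis for synthetic rows.
        real = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):  # zero iterations for the majority class
            out_x.append(generate(real, rng))
            out_y.append(cls)
    return out_x, out_y


def jitter_generator(real_rows, rng, scale=0.05):
    """Placeholder generator (NOT a GAN): copy a real row, add small noise."""
    base = rng.choice(real_rows)
    return [v + rng.gauss(0.0, scale * (abs(v) or 1.0)) for v in base]
```

As a usage example, calling `balance_classes(x, y, jitter_generator)` on a set with 6, 2, and 1 samples in three classes returns the 9 original rows followed by 9 synthetic ones, so that each class ends up with 6 samples. Swapping `jitter_generator` for a function that samples from a trained GAN generator would leave the rest of the pipeline unchanged.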
Authors
关其峰
赵夙
朱晓荣
GUAN Qi-feng; ZHAO Su; ZHU Xiao-rong (Jiangsu Key Laboratory of Wireless Communication, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
Source
《光通信研究》
2023, No. 3, pp. 19-23 (5 pages)
Study on Optical Communications
Funding
National Natural Science Foundation of China (61871237, 92067101)
Key Research and Development Program of Jiangsu Province (BE2021013-3).