Abstract
P2P traffic has become a major component of Internet traffic, and classifying it accurately matters for effective network management and sensible use of network resources. In recent years, applying machine learning to P2P traffic classification has become an emerging direction in traffic identification. This paper builds a decision-tree model using the C4.5 algorithm and the characteristic attributes of P2P traffic, and uses it to classify P2P traffic. Experimental results show that the decision-tree approach effectively avoids the instability caused by changes in the distribution of P2P network flows. Compared with SVM (support vector machine) and NBK (naïve Bayes using kernel density estimation, an improved naïve Bayes method), it raises average classification accuracy by at least 3.83 percentage points.
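The abstract does not list the traffic attributes or describe the implementation, so the following is only a minimal sketch of the gain-ratio criterion that C4.5 uses to choose a split attribute. The attribute names (`port_class`, `pkt_size`), their values, and the six toy flows are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5 gain ratio for splitting `labels` on a discrete attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy flows (illustrative only, not data from the paper).
flows = [
    {"port_class": "high", "pkt_size": "large", "label": "P2P"},
    {"port_class": "high", "pkt_size": "small", "label": "P2P"},
    {"port_class": "high", "pkt_size": "large", "label": "P2P"},
    {"port_class": "low", "pkt_size": "small", "label": "non-P2P"},
    {"port_class": "low", "pkt_size": "large", "label": "non-P2P"},
    {"port_class": "low", "pkt_size": "small", "label": "non-P2P"},
]

labels = [f["label"] for f in flows]
ratios = {
    attr: gain_ratio([f[attr] for f in flows], labels)
    for attr in ("port_class", "pkt_size")
}
# The attribute with the highest gain ratio becomes the root split.
best = max(ratios, key=ratios.get)
print(best, ratios)
```

In this toy data `port_class` separates the two classes perfectly, so it is selected as the root split. A full C4.5 implementation would also handle continuous attributes by thresholding and would prune the resulting tree, neither of which is shown here.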
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2009, No. 12, pp. 4690-4693 (4 pages)
Application Research of Computers
Keywords
P2P (peer-to-peer network)
traffic characteristic
decision tree
traffic classification
C4.5