Abstract
Peer-to-peer (P2P) overlay networks, a typical class of distributed systems, are attracting growing attention. P2P technology is used in file sharing, streaming media, instant messaging, and other fields, and P2P traffic accounts for more than 60% of Internet traffic. To manage and control this traffic better, P2P traffic identification models need to be studied in depth. First, a machine learning model based on the wavelet support vector machine (ML-WSVM) is proposed to identify both known and unknown P2P traffic. ML-WSVM combines wavelets with the support vector machine by replacing the usual SVM kernel function with a wavelet basis function that satisfies the wavelet frame conditions and Mercer's theorem. The model thus exploits both the multi-scale property of wavelets and the classification strengths of the support vector machine. Then, an improved sequential minimal optimization (SMO) algorithm based on a loss function is proposed to solve for the optimal separating hyperplane of ML-WSVM. Finally, theoretical analysis and experimental results show that ML-WSVM greatly improves the accuracy and efficiency of P2P traffic identification, particularly for encrypted packets.
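The kernel substitution described in the abstract can be illustrated with a minimal sketch. The abstract does not name the mother wavelet, so the Morlet-type function h(u) = cos(1.75u) exp(-u^2/2), the dilation parameter a, the function names, and the three flow feature vectors below are all assumptions made for illustration, not the authors' implementation. The kernel is the translation-invariant product form K(x, z) = prod_i h((x_i - z_i) / a).

```python
import math


def mother_wavelet(u):
    """Assumed Morlet-type mother wavelet: h(u) = cos(1.75 u) * exp(-u^2 / 2)."""
    return math.cos(1.75 * u) * math.exp(-u * u / 2.0)


def wavelet_kernel(x, z, a=1.0):
    """Translation-invariant wavelet kernel.

    K(x, z) is the product over features of h((x_i - z_i) / a),
    where a > 0 is the (assumed) dilation parameter.
    """
    if len(x) != len(z):
        raise ValueError("feature vectors must have the same length")
    if a <= 0:
        raise ValueError("dilation parameter a must be positive")
    k = 1.0
    for xi, zi in zip(x, z):
        k *= mother_wavelet((xi - zi) / a)
    return k


def gram_matrix(samples, a=1.0):
    """Kernel (Gram) matrix for a list of feature vectors."""
    return [[wavelet_kernel(x, z, a) for z in samples] for x in samples]


# Hypothetical flow feature vectors (e.g. normalized per-flow statistics).
flows = [
    [0.2, 1.5, 0.7],
    [0.3, 1.4, 0.6],
    [2.5, 0.1, 3.0],
]
K = gram_matrix(flows, a=1.0)
```

A Gram matrix of this kind could then be handed to any SVM solver that accepts a precomputed kernel, for example scikit-learn's SVC with kernel="precomputed". The loss-function-based SMO variant mentioned in the abstract is not reproduced here, since the abstract gives no details of it.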
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core Journals
2011, No. 12, pp. 2253-2260 (8 pages)
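Funding
National Natural Science Foundation of China (60573141, 60773041)
National High Technology Research and Development Program of China ("863" Program) (2006AA01Z439, 2007AA01Z404)
Nanjing High-Tech Funding Project (2007软资127)
Jiangsu Province University Science and Technology Innovation Program (CX10B_198Z)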
Keywords
peer-to-peer networks
network traffic identification
support vector machine
wavelet function
loss function