
Application of Resampling Method Based on Mahalanobis Distance in Traffic Identification (cited by: 1)
Abstract: To address the multi-class imbalance of data distribution in network traffic identification, this paper proposes a resampling method based on Mahalanobis distance. First, the network traffic data are zero-centered (zero-mean normalization) and transformed into the principal component space. Second, new samples are generated for each minority class according to the Mahalanobis distance from its samples to the center point of the set. Third, the newly generated samples are transformed back to the original space and the zero-centering is reversed. Finally, all newly generated samples are returned. Traffic classification experiments on the public network traffic traces of the University of Cambridge show that the method effectively improves the identification accuracy of the minority classes and achieves better classification performance than existing resampling methods and cost-sensitive methods.
Authors: SHI Hong-Tao; LI Hong-Ping; LIU Jing (College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China)
Source: Periodical of Ocean University of China (《中国海洋大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2019, No. 8, pp. 136-141 (6 pages)
Funding: Supported by the National High Technology Research and Development Program of China (2013AA09A506-4)
Keywords: Mahalanobis distance; principal component analysis; traffic identification; multi-class imbalance; resampling method
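
The four steps listed in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say how a new sample is derived from the Mahalanobis distance, so the generation rule here is an assumption: each synthetic point keeps the Mahalanobis distance of a randomly chosen minority sample but takes a random direction in principal component space. The function name `mahalanobis_oversample` and its parameters are hypothetical, and the sketch assumes at least two features and more samples than features.

```python
import numpy as np

def mahalanobis_oversample(X_min, n_new, rng=None):
    """Illustrative oversampling of one minority class (not the paper's exact rule)."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)

    # Step 1: zero-center the class, then rotate into principal-component space.
    mean = X_min.mean(axis=0)
    Xc = X_min - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    keep = eigvals > 1e-12                      # drop degenerate directions
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    Z = Xc @ eigvecs

    # Step 2 (assumed rule): pick seed samples, measure their Mahalanobis
    # distance to the class center, and place each new point at the same
    # distance along a random direction.
    scale = np.sqrt(eigvals)                    # std. dev. along each component
    seeds = Z[rng.integers(0, len(Z), size=n_new)]
    radii = np.linalg.norm(seeds / scale, axis=1)
    directions = rng.standard_normal((n_new, len(eigvals)))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    Z_new = radii[:, None] * directions * scale

    # Step 3: rotate back to the original space and undo the zero-centering.
    X_new = Z_new @ eigvecs.T + mean

    # Step 4: return only the newly generated samples.
    return X_new
```

In a multi-class setting, a function like this would be called once per minority class, with `n_new` chosen to bring that class closer to the size of the majority class.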
