
Feature selection for multi-class imbalanced Internet traffic (cited by: 8)
Abstract: To address the multi-class imbalance problem in Internet traffic classification, this paper proposes a hybrid feature selection method based on relative uncertainty and symmetric uncertainty. First, relative uncertainty is used to select a candidate feature set for each class. Then, within each candidate feature set, the features with high symmetric uncertainty are retained and the others are discarded. Finally, a wrapper feature selection method based on the C4.5 decision tree determines the optimal feature subset. Experimental results on real-world Internet traffic data sets show that, compared with traditional feature selection methods, the proposed method achieves higher overall accuracy, higher recall on minority classes, and a higher g-mean, and can therefore reduce the adverse effects of multi-class imbalance.
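The two quantities named in the abstract that have standard textbook definitions can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes that features have already been discretized, that symmetric uncertainty is the usual SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), and that g-mean is the geometric mean of per-class recall. The function names are hypothetical. The per-class relative uncertainty step is left out because the abstract does not give its definition.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1.

    Assumes x and y are equal-length sequences of discrete values
    (continuous flow features would need to be discretized first).
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    # Mutual information from the joint entropy: I = H(X) + H(Y) - H(X, Y).
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1.0 / len(recalls))


if __name__ == "__main__":
    # Toy data (made up): one feature that mirrors the label, one that does not.
    labels = [0, 0, 1, 1]
    print(symmetric_uncertainty([0, 0, 1, 1], labels))  # feature identical to label
    print(symmetric_uncertainty([0, 1, 0, 1], labels))  # feature independent of label
    print(g_mean([0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 0, 1]))
```

Because g-mean multiplies the per-class recalls, a single poorly recalled minority class pulls the score down even when overall accuracy is high, which is why it is a common companion metric for imbalanced classification.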
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2017, No. 2, pp. 568-571, 594 (5 pages)
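Funding: National Natural Science Foundation of China (61501289); National Natural Science Foundation of China, Young Scientists Fund (61302093); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20133108120018); Major Project of the Science and Technology Commission of Shanghai Municipality (14511101505); sub-project of the Chinese Academy of Sciences Strategic Priority Research Program "Research on Future Network System Architecture and Key Technologies" (XDA06010301); Science and Technology Commission of Shanghai Municipality "Sailing Program" (14YF1408900)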
Keywords: Internet traffic; multi-class imbalance; feature selection; relative uncertainty; symmetric uncertainty