Classification model based on ensemble random forests

Cited by: 20
Abstract A single classifier, unlike an ensemble, often cannot achieve relatively high and stable accuracy. To address this problem, this paper proposes a classification model that integrates multiple random forests and combines their outputs by majority voting with thresholds. The implementation has three levels: building the ensemble classification model, preliminary prediction of instances, and combination analysis. The model is implemented in the MapReduce programming style and evaluated on P2P traffic identification, where it is compared with a single random forest and with ensembles of other algorithms. The experiments show that the proposed model achieves better overall classification performance in P2P traffic identification and offers a feasible reference method for two-class classification.
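The combination step named in the abstract, majority voting with thresholds, might look roughly like the sketch below. The abstract does not define the voting rule, so this sketch assumes that the most-voted label is accepted only when its share of the votes strictly exceeds a threshold, and that the instance is otherwise left undecided (`None`). The names `vote_with_threshold` and `ensemble_predict`, the labels `"p2p"` and `"other"`, and the stand-in "forests" are all hypothetical; each forest is modeled as a plain callable, and no real random forest or MapReduce job is involved.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence

# Hypothetical stand-in for one trained random forest: maps a feature vector to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def vote_with_threshold(votes: Sequence[Hashable], threshold: float = 0.5) -> Optional[Hashable]:
    """Return the most-voted label if its vote share exceeds `threshold`, else None.

    Assumption: this acceptance rule is illustrative, not the paper's definition.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > threshold else None


def ensemble_predict(
    forests: Sequence[Classifier],
    instance: Sequence[float],
    threshold: float = 0.5,
) -> Optional[Hashable]:
    """Collect one preliminary prediction per forest, then combine them by voting."""
    # Preliminary prediction: every forest labels the instance independently
    # (conceptually the "map" side of a MapReduce job).
    votes = [forest(instance) for forest in forests]
    # Combination analysis: thresholded majority vote (conceptually the "reduce" side).
    return vote_with_threshold(votes, threshold)


if __name__ == "__main__":
    # Three toy rules standing in for trained forests (hypothetical features).
    forests = [
        lambda x: "p2p" if x[0] > 10 else "other",
        lambda x: "p2p" if x[1] > 0.5 else "other",
        lambda x: "p2p" if x[0] + x[1] > 12 else "other",
    ]
    print(ensemble_predict(forests, (20.0, 0.1), threshold=0.5))
    print(ensemble_predict(forests, (20.0, 0.1), threshold=0.7))
```

With a threshold of 0.5 the rule reduces to an ordinary strict majority; raising the threshold trades coverage for confidence, since more instances are left undecided.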
Source Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 6, pp. 1621-1624, 1629 (5 pages)
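Funding Sub-project of the National Science and Technology Major Project (2012ZX03005002-005); Chongqing Application Development Program (cstc2013yykf A40006); 2013 Chongqing University Innovation Team Building Program (KJTD201312)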
Keywords ensemble learning; random forests; majority voting with thresholds; MapReduce; P2P traffic identification

