Research on Boosting-based Imbalanced Data Classification (cited by: 16)
Abstract: This paper studies boosting-based algorithms for imbalanced data classification. After surveying and analyzing existing algorithms, it proposes a weight-sampling boosting algorithm. Weight sampling of the training examples changes the original data distribution, yielding a classifier suited to imbalanced data. In essence, the algorithm uses a sampling function to adjust the form of the original boosting loss function so that the classification loss on positive examples is further emphasized; the classifier therefore focuses on discriminating positive examples correctly, which raises their overall recognition rate. The algorithm is simple to implement and practical. Experiments on UCI data sets show that, for imbalanced data classification, weight-sampling boosting outperforms the original boosting algorithm and earlier algorithms.
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 12, pp. 224-228 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60974129, 70931002)
Keywords: imbalanced data classification, boosting, sampling
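
The abstract describes the idea only in outline, so the following is a minimal sketch rather than the authors' algorithm. It assumes an AdaBoost-style loop with decision stumps in which each round's training set is resampled with probability proportional to the boosting weight times a sampling function. Here that function is a hypothetical constant `pos_factor` applied to positive examples (label +1); the names `fit_stump`, `weight_sampling_boost` and `predict` are likewise illustrative.

```python
import math
import random


def fit_stump(X, y):
    """Return (feature, threshold, polarity) with the fewest errors on (X, y)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(
                    1
                    for row, label in zip(X, y)
                    if (pol if row[j] >= thr else -pol) != label
                )
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best[1:]


def stump_predict(stump, row):
    j, thr, pol = stump
    return pol if row[j] >= thr else -pol


def weight_sampling_boost(X, y, rounds=10, pos_factor=3.0, seed=0):
    """AdaBoost-style loop whose weak learners are trained on weighted resamples.

    Assumption: the sampling function multiplies the boosting weight of each
    positive example by pos_factor, so positives are drawn more often.
    """
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n  # boosting weights over the original training set
    ensemble = []
    for _ in range(rounds):
        # Sampling probabilities: boosting weight times the sampling function.
        p = [wi * (pos_factor if yi == 1 else 1.0) for wi, yi in zip(w, y)]
        idx = rng.choices(range(n), weights=p, k=n)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])

        # Weighted error and stump coefficient, measured on the original data.
        preds = [stump_predict(stump, row) for row in X]
        err = sum(wi for wi, pi, yi in zip(w, preds, y) if pi != yi)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))

        # Standard exponential reweighting, then renormalisation.
        w = [wi * math.exp(-alpha * yi * pi) for wi, pi, yi in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

Calling `weight_sampling_boost(X, y)` with labels in {-1, +1} returns a list of `(alpha, stump)` pairs that `predict` combines by weighted vote. Setting `pos_factor=1.0` reduces the sketch to boosting by plain weighted resampling, which makes it easy to compare the two on a skewed data set.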

References (13)

  • 1. Gao Jiawei, Liang Jiye. Research progress on classification of imbalanced data sets[J]. Computer Science, 2008, 35(4): 10-13. Cited by: 16
  • 2. Tu Chengsheng, Lu Yuchang. Perspectives on Boosting[J]. Computer Science, 2005, 32(5): 140-143. Cited by: 2
  • 3. Mason L, Baxter J, Bartlett P, et al. Boosting algorithms as gradient descent[C]//Neural Information Processing Systems 12. Cambridge: MIT Press, 2000: 512-518.
  • 4. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting[J]. The Annals of Statistics, 2000, 28(2): 337-407.
  • 5. Li Zhengxin, Zhao Lindu. An SVM classifier for imbalanced data sets based on SMOTEBoost[J]. Systems Engineering, 2008, 26(5): 116-119. Cited by: 14
  • 6. Seiffert C, Khoshgoftaar T M, Hulse J V, et al. RUSBoost: Improving classification performance when training data is skewed[C]//Proceedings of 19th International Conference on Pattern Recognition. Washington DC: IEEE Computer Society, 2008: 1-4.
  • 7. Guo H Y, Viktor H L. Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach[J]. SIGKDD Explorations, 2004, 6(1): 30-39.
  • 8. Sun Y, Kamel M S, Wong A K C, et al. Cost-sensitive boosting for classification of imbalanced data[J]. Pattern Recognition, 2007, 40(12): 3358-3378.
  • 9. Ge Jun-Feng, Luo Yu-Pin. A Comprehensive Study for Asymmetric AdaBoost and Its Application in Object Detection[J]. Acta Automatica Sinica, 2009, 35(11): 1403-1409. Cited by: 7
  • 10. Li Q J, Mao Y B, Wang Z Q, et al. Cost-sensitive boosting: fitting an additive asymmetric logistic regression model[C]//Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning (ACML '09). Berlin: Springer, 2009: 234-247.
