一种用于非平衡数据分类的集成学习模型 (cited by: 5)

Ensemble learning model for imbalanced data classification
Abstract  To address classification on imbalanced datasets, this paper presents an improved SVM-KNN classification algorithm and, on that basis, designs an ensemble learning model. The model uses limited sampling to partition the majority-class samples, recombines each resulting majority-class sub-cluster with the minority-class samples, and trains an improved SVM-KNN on each combined subset to obtain several basic classifiers, which are then combined. Experiments on UCI datasets show that the model performs well on imbalanced data classification.
Source  Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, No. 29, pp. 119-123, 219 (6 pages)
Funding  National Natural Science Foundation of China (No. 61175048)
Keywords  imbalanced data; ensemble learning model; basic classifier; improved Support Vector Machine-K Nearest Neighbor (SVM-KNN); UCI dataset
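
The pipeline outlined in the abstract (partition the majority class, pair each part with all minority samples, train one basic classifier per pair, combine them) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify how limited sampling draws the subsets, how the improved SVM-KNN works, or how the basic classifiers are combined. The sketch therefore assumes a random partition into chunks of bounded size, uses a nearest-centroid learner purely as a placeholder for the improved SVM-KNN, and combines by majority vote. All function, class, and parameter names are hypothetical.

```python
import random
from collections import Counter


def split_majority(majority, max_size, seed=0):
    """Shuffle the majority samples and cut them into chunks of at most max_size.

    Assumption: stands in for the paper's limited sampling, whose exact
    procedure is not given in the abstract.
    """
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + max_size] for i in range(0, len(shuffled), max_size)]


class NearestCentroid:
    """Placeholder base learner. This is NOT the paper's improved SVM-KNN."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for j, value in enumerate(x):
                acc[j] += value
        # One centroid (mean vector) per class label.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda c: dist2(self.centroids[c]))


def train_ensemble(majority, minority, max_size, make_learner=NearestCentroid,
                   maj_label=0, min_label=1):
    """Train one base classifier per majority chunk combined with all minority samples."""
    models = []
    for chunk in split_majority(majority, max_size):
        X = chunk + minority
        y = [maj_label] * len(chunk) + [min_label] * len(minority)
        models.append(make_learner().fit(X, y))
    return models


def predict(models, x):
    """Combine base classifiers by majority vote (an assumed combination rule)."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data: 9 majority points near the origin, 3 minority points near (5, 5).
majority = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
minority = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
models = train_ensemble(majority, minority, max_size=3)
```

Setting `max_size` close to the number of minority samples gives each basic classifier a roughly balanced training set, which is the point of splitting the majority class. Any learner exposing `fit` and `predict_one` can be passed as `make_learner` in place of the placeholder.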