Unbalanced Data Classification Algorithm Based on Combination of ADASYN and SMOTE (ADASYN和SMOTE相结合的不平衡数据分类算法)

Cited by: 15
Abstract: When a traditional support vector machine (SVM) performs binary classification on imbalanced data, the classification boundary tends to shift. Current approaches to the imbalanced data problem work at either the data level or the algorithm level. This paper proposes a data-level method that uses the ADASYN and SMOTE algorithms jointly to generate minority-class sample points. The method uses the K-nearest-neighbor algorithm to count minority-class and majority-class sample points, categorizes the minority-class points accordingly, and then synthesizes new minority-class points with ADASYN and SMOTE respectively. In the experiments, ROC curves are used to compare the method against synthesizing minority-class points with SMOTE alone or with ADASYN alone. The proposed algorithm achieves the highest AUC value, which indicates that it can improve the effectiveness of imbalanced data classification.
Authors: JIANG Hua (蒋华); JIANG Ri-chen (江日辰); WANG Xin (王鑫); WANG Hui-jiao (王慧娇) (School of Computer and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541000, China)
Source: Computer Simulation (《计算机仿真》), Peking University Core Journals (北大核心), 2020, No. 3, pp. 254-258, 420 (6 pages in total)
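Funding: 2016 Guangxi Universities Basic Ability Improvement Project for Young and Middle-aged Teachers (ky2016YB150); Guilin University of Electronic Technology Graduate Education Innovation Program (2017YJCX48).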
Keywords: imbalanced data; support vector machine (SVM); classification algorithm
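
The pipeline summarized in the abstract (count neighbors with KNN, group the minority points, then synthesize with ADASYN and SMOTE) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the minority points are grouped or which group goes to which algorithm. The sketch assumes that minority points whose neighborhood is mostly majority-class get ADASYN-style density-weighted sampling and the rest get SMOTE-style uniform sampling. The names `combined_oversample`, `threshold`, and `n_new`, and the proportional budget split, are all hypothetical.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean), excluding i."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )
    return order[:k]


def interpolate(a, b, rng):
    """Return a random point on the segment between a and b (SMOTE-style)."""
    t = rng.random()
    return tuple(x + t * (y - x) for x, y in zip(a, b))


def combined_oversample(X, y, minority=1, k=3, threshold=0.5, n_new=10, seed=0):
    """Hypothetical sketch; threshold must be > 0 so ADASYN weights are positive."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    if len(min_idx) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    min_pts = [X[i] for i in min_idx]
    k_min = min(k, len(min_pts) - 1)

    # Step 1: share of majority-class points among each minority point's
    # k nearest neighbours in the full data set.
    ratio = [
        sum(1 for j in knn_indices(X, i, k) if y[j] != minority) / k
        for i in min_idx
    ]

    # Step 2 (ASSUMED rule): split minority points by that share.
    hard = [p for p, r in enumerate(ratio) if r >= threshold]
    easy = [p for p, r in enumerate(ratio) if r < threshold]

    # Step 3 (ASSUMED): split the synthesis budget in proportion to group size.
    n_hard = round(n_new * len(hard) / len(min_pts))
    n_easy = n_new - n_hard

    def synthesize(bases, weights, count):
        out = []
        for _ in range(count):
            # Pick a base point, then interpolate toward one of its
            # nearest minority-class neighbours.
            p = rng.choices(bases, weights=weights)[0]
            q = rng.choice(knn_indices(min_pts, p, k_min))
            out.append(interpolate(min_pts[p], min_pts[q], rng))
        return out

    # ADASYN-style: base points drawn with probability proportional to ratio.
    synthetic = synthesize(hard, [ratio[p] for p in hard], n_hard)
    # SMOTE-style: base points drawn uniformly.
    synthetic += synthesize(easy, None, n_easy)
    return synthetic, [min_idx[p] for p in hard], [min_idx[p] for p in easy]
```

Calling `combined_oversample(X, y, n_new=12)` on a list of coordinate tuples `X` and labels `y` returns the synthetic points together with the original indices of the two minority groups. Weighted random sampling stands in here for ADASYN's deterministic per-point allocation, which keeps the sketch short. A library such as imbalanced-learn would be the more natural choice for real experiments.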
