An Imbalanced Data Classification Method Based on Hybrid Resampling and Fine Cost Sensitive Support Vector Machine (Cited by: 1)

Abstract: When building a classification model, the scenario in which one class has significantly more samples than the other is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which can reduce accuracy on the minority class (usually defined as the positive class) and in turn degrade the overall performance of the model. This article proposes MSHR-FCSSVM, a method for imbalanced data classification that combines a new hybrid resampling approach (MSHR) with a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). MSHR measures the separability of each negative sample by its Silhouette value, computed from the Mahalanobis distance between samples. On this basis, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (the over-sampling step), and finally deleted (the under-sampling step). The approach thus replaces pseudo-negative samples with newly generated positive samples one by one, clearing up the inter-class overlap on the borderline without changing the overall size of the dataset. FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification simultaneously, and it finely tunes the class cost weights with the RIME algorithm, an efficient optimization algorithm based on the physical phenomenon of rime ice, using cross-validation accuracy as the fitness function, so as to adjust the classification borderline accurately. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced ones. The experimental results show that MSHR-FCSSVM performs better than the comparison methods in most cases, and that both MSHR and FCSSVM play significant roles.
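
The resampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the covariance matrix is estimated, which Silhouette threshold marks a pseudo-negative sample, or which points are interpolated. The sketch assumes one covariance matrix pooled over all samples, a threshold of 0, and interpolation between each pseudo-negative sample and its nearest positive sample with a random coefficient. The function names (`mahalanobis_matrix`, `silhouette_two_class`, `hybrid_resample`) and the toy data are hypothetical, and the RIME-tuned FCSSVM stage is not covered.

```python
import numpy as np


def mahalanobis_matrix(X):
    """Pairwise Mahalanobis distances between the rows of X.

    Assumption: one covariance matrix estimated from all samples together.
    """
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = X[:, None, :] - X[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors


def silhouette_two_class(D, y):
    """Silhouette value of every sample for a two-class labelling."""
    s = np.zeros(len(y))
    for i in range(len(y)):
        same = y == y[i]
        same[i] = False  # exclude the sample itself
        other = y != y[i]
        if not same.any() or not other.any():
            continue
        a = D[i, same].mean()   # mean distance to the sample's own class
        b = D[i, other].mean()  # mean distance to the other class
        denom = max(a, b)
        s[i] = (b - a) / denom if denom > 0 else 0.0
    return s


def hybrid_resample(X, y, threshold=0.0, seed=0):
    """Replace low-Silhouette negatives (label 0) with synthetic positives (label 1).

    Assumptions: `threshold` and the nearest-positive interpolation partner
    are illustrative choices, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    D = mahalanobis_matrix(X)
    s = silhouette_two_class(D, y)
    pos = np.flatnonzero(y == 1)
    pseudo = np.flatnonzero((y == 0) & (s < threshold))
    X_res, y_res = X.copy(), y.copy()
    for i in pseudo:
        j = pos[np.argmin(D[i, pos])]          # nearest positive sample
        lam = rng.random()                     # interpolation coefficient in [0, 1)
        X_res[i] = X[j] + lam * (X[i] - X[j])  # over-sampling: new positive point
        y_res[i] = 1                           # under-sampling: the negative is dropped
    return X_res, y_res, pseudo


# Toy data: six negatives near the origin, one negative sitting inside the
# positive cluster (index 6), and three positives.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
              [3.0, 3.2],
              [3, 3], [3.5, 3], [3, 3.5]])
y = np.array([0] * 7 + [1] * 3)

X_res, y_res, pseudo = hybrid_resample(X, y)
print("replaced negatives:", pseudo)
print("class counts before:", np.bincount(y), "after:", np.bincount(y_res))
```

Overwriting each flagged row in place is equivalent to generating a positive sample and then deleting the pseudo-negative one, so the dataset keeps its size while the class ratio shifts toward the positive class.
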
Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3977-3999, 23 pages in total.
Funding: Supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation, China (No. 52065033).
