Imbalanced Data Classification Using SVM Based on Improved Simulated Annealing Featuring Synthetic Data Generation and Reduction (Cited by: 1)

Abstract: Imbalanced data classification is one of the major problems in machine learning. An imbalanced dataset typically has significant differences in the number of samples between its classes. In most cases, the performance of a machine learning algorithm such as the Support Vector Machine (SVM) is affected when dealing with an imbalanced dataset: classification accuracy is skewed toward the majority class, and minority-class samples are predicted poorly. In this paper, a hybrid approach combining a data pre-processing technique with an SVM algorithm based on improved Simulated Annealing (SA) was proposed. First, a data pre-processing technique was proposed that addresses the resampling strategy for handling imbalanced datasets. In this technique, data were first synthetically generated to equalize the number of samples between classes, followed by a reduction step to remove redundant and duplicated data. Next, the SVM was trained on the balanced dataset. Since this algorithm requires an iterative search for the best penalty parameter during training, an improved SA algorithm was proposed for this task. In this improvement, a new acceptance criterion for candidate solutions in the SA algorithm was introduced to enhance the accuracy of the optimization process. Experiments on ten publicly available imbalanced datasets demonstrated higher classification accuracy for the proposed approach than for the conventional implementation of SVM. The proposed approach achieved an average accuracy of 89.65% on binary classification, demonstrating its good performance.
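The pipeline outlined in the abstract (synthetic generation, duplicate reduction, then an SA search for the SVM penalty parameter) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not describe the paper's generator, its reduction rule, or its new acceptance criterion, so the sketch substitutes SMOTE-like interpolation, exact-duplicate removal, and the textbook Metropolis acceptance rule. The function names `oversample_minority`, `drop_duplicates`, and `anneal_penalty` are hypothetical, and `score` stands in for whatever validation accuracy an SVM would report for a given penalty value.

```python
import math
import random


def oversample_minority(minority, n_target, rng):
    """Grow the minority class to n_target samples by interpolating between
    random pairs of existing samples (SMOTE-like stand-in, not the paper's generator)."""
    samples = list(minority)
    while len(samples) < n_target:
        a, b = rng.sample(minority, 2)  # needs at least two minority samples
        t = rng.random()
        samples.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return samples


def drop_duplicates(samples, decimals=6):
    """Reduction step (assumed form): keep the first occurrence of each sample,
    comparing coordinates after rounding."""
    seen, kept = set(), []
    for s in samples:
        key = tuple(round(v, decimals) for v in s)
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept


def anneal_penalty(score, c_init=1.0, t_init=1.0, cooling=0.95, steps=200, seed=0):
    """Search for a penalty value C that maximizes score(C) using standard
    simulated annealing with Metropolis acceptance (not the paper's improved criterion)."""
    rng = random.Random(seed)
    c, s = c_init, score(c_init)
    best_c, best_s = c, s
    temp = t_init
    for _ in range(steps):
        # Perturb C multiplicatively so the candidate stays positive.
        cand = c * math.exp(rng.gauss(0.0, 0.5))
        cand_s = score(cand)
        delta = cand_s - s
        # Always accept improvements; accept worse moves with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            c, s = cand, cand_s
            if s > best_s:
                best_c, best_s = c, s
        temp *= cooling  # geometric cooling schedule
    return best_c, best_s
```

In practice `score` would train an SVM on the balanced, de-duplicated data with penalty `C` and return held-out accuracy; a toy objective such as `lambda c: -(math.log10(c) - 1.0) ** 2` is enough to exercise the search loop.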

Authors: Hussein Ibrahim Hussein; Said Amirul Anwar; Muhammad Imran Ahmad

Affiliations: Department of Computer Techniques Engineering; Faculty of Electronic Engineering & Technology

Source: Computers, Materials & Continua (English-language journal; indexed in SCIE, EI), 2023, Issue 4, pp. 547-564 (18 pages)

Keywords: imbalanced data; resampling technique; data reduction; support vector machine; simulated annealing

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
