A credit risk prediction model named KM-ADASYN-TL-FLLightGBM (KADT-FLightGBM) is proposed in this study. First, to overcome the limitations of traditional sampling methods in dealing with imbalanced datasets, an improved ADASYN sampling method based on the K-means clustering algorithm is constructed, and the Tomek Links method is then used to filter the generated samples. Second, a LightGBM algorithm optimized with the Focal Loss is trained on the dataset obtained by the improved ADASYN sampling. Finally, a comparative analysis between the proposed ensemble model and other sampling methods is conducted on the Lending Club dataset. The results demonstrate that the proposed model effectively minimizes the misclassification of minority-class samples in credit risk prediction and can serve as a reference for similar studies.
Funding: Supported by the National Natural Science Foundation of China (Nos. 71503108 and 62077029), the CCF-Huawei Innovation Research Program Grant (No. CCF-HuaweiFM202209), and the Research and Practice Innovation Project of Jiangsu Normal University (No. 2022XKT1540).
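The abstract names the Focal Loss as the training objective for LightGBM but does not give its form or parameters. As a rough illustration only, the sketch below uses the commonly cited binary focal loss, FL = -alpha_t (1 - p_t)^gamma log(p_t), and computes the per-sample loss, gradient, and Hessian with respect to the raw score, which are the quantities a gradient-boosting custom objective typically needs. The function names, the values alpha=0.25 and gamma=2.0, and the finite-difference Hessian are assumptions for illustration, not the authors' implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _pt_alpha(z: float, y: int, alpha: float):
    """Return (p_t, alpha_t): probability and weight of the true class."""
    p = sigmoid(z)
    p_t = p if y == 1 else 1.0 - p
    # Clamp to avoid log(0) for extreme raw scores.
    p_t = min(max(p_t, 1e-12), 1.0 - 1e-12)
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return p_t, alpha_t


def focal_loss(z: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for raw score z and label y in {0, 1}."""
    p_t, alpha_t = _pt_alpha(z, y, alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad(z: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Analytic derivative of the focal loss with respect to z."""
    p_t, alpha_t = _pt_alpha(z, y, alpha)
    sign = 1.0 if y == 1 else -1.0  # d p_t / d z = sign * p_t * (1 - p_t)
    return sign * alpha_t * (
        gamma * p_t * (1.0 - p_t) ** gamma * math.log(p_t)
        - (1.0 - p_t) ** (gamma + 1.0)
    )


def focal_hess(z: float, y: int, alpha: float = 0.25, gamma: float = 2.0,
               h: float = 1e-4) -> float:
    """Second derivative, approximated by a central difference of the gradient."""
    return (focal_grad(z + h, y, alpha, gamma)
            - focal_grad(z - h, y, alpha, gamma)) / (2.0 * h)


if __name__ == "__main__":
    # A confidently correct sample contributes far less loss than a
    # confidently wrong one, which is the point of the (1 - p_t)^gamma factor.
    print("easy positive:", focal_loss(3.0, 1))
    print("hard positive:", focal_loss(-3.0, 1))
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary binary cross-entropy, which gives a convenient sanity check. In practice the per-sample gradient and Hessian would be computed in vectorized form over all training rows before being handed to the boosting library.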