Machine Learning-based Prediction Models for Credit Card Overdue (基于机器学习的信用卡逾期预测研究)
Abstract: Banks and other credit institutions want to use customers' credit card data to build models that predict whether target customers will become overdue, and to treat customers predicted as "not overdue" as priority customers to develop. To address the low reliability of the "not overdue" predictions made by traditional machine learning models, this study builds a random forest model based on the PR (precision-recall) curve. During data preprocessing, categorical data are quantified with one-hot encoding, and the sample data are balanced with the SMOTE method. The optimal number of features and the optimal threshold of 0.182, which maximizes the score, are then selected from the PR curve to construct the random forest model, and the hyperparameters are tuned by grid search. Empirical results show that the proposed model achieves a recall of 0.854 and a reliability of 0.918, a significant improvement in prediction performance over traditional machine learning models, which makes it better suited for banks to evaluate customers in batches and to screen for high-quality customers.
Authors: Lu Rongwei (卢荣伟); Huang Change (黄嫦娥); Xie Jiuhui (谢久暉) (School of Mathematics and Computational Science, Guilin University of Electronic Technology, Guilin, China)
Source: Scientific and Technological Innovation (《科学技术创新》), 2024, No. 6, pp. 130-133 (4 pages)
Keywords: overdue prediction; machine learning; PR curve; random forest
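The abstract's threshold-selection step (choosing the cutoff on the PR curve that maximizes a score) can be illustrated with a minimal sketch. The abstract does not name the score, so F1 is assumed here; the function names, labels, and probabilities are hypothetical toy values, not the paper's data or code. In practice the probabilities would come from the trained random forest.

```python
from typing import List, Tuple


def precision_recall_at(y_true: List[int], scores: List[float],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall of class 1 when predicting 1 for score >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(y_true: List[int], scores: List[float]) -> Tuple[float, float]:
    """Sweep every distinct score as a candidate threshold (one point on the
    PR curve each) and keep the one with the highest F1 (assumed score)."""
    best_t, best_f1 = 0.0, -1.0
    for t in sorted(set(scores)):
        p, r = precision_recall_at(y_true, scores, t)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical toy data: 1 marks the class the PR curve is drawn for.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.15, 0.90]

threshold, f1 = best_threshold(y_true, scores)
print(f"best threshold = {threshold:.2f}, F1 = {f1:.3f}")
```

A threshold well below 0.5, like the 0.182 reported in the abstract, is plausible when the optimum is chosen this way, since the sweep trades precision against recall rather than assuming a fixed cutoff.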