Abstract
With the growth of big data, credit institutions hold borrower data of ever higher dimensionality, and high-dimensional data poses many difficulties for building credit risk evaluation models. Traditional credit risk evaluation models are becoming increasingly ineffective. Using China UnionPay credit data as the research sample, this study applies Lasso-RF two-stage feature selection and selects common credit evaluation classification algorithms such as logistic regression, support vector machine, random forest, and decision tree to test the effectiveness of the two-stage feature selection on four metrics: accuracy, precision, recall, and F1 score. The experimental results show that, compared with the original data set, the Lasso-RF two-stage feature selection method improves all four performance metrics of the classifiers, demonstrating that the two-stage feature selection method achieves better classification performance in personal credit risk assessment.
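The abstract does not give the details of the Lasso-RF procedure, so the following is only a minimal sketch of one plausible reading: a Lasso filter that drops features with zero coefficients, followed by a ranking of the survivors by random forest importance, and a comparison of the four classifiers on the four metrics. It assumes scikit-learn is available and uses synthetic data from `make_classification` in place of the China UnionPay sample. The names `two_stage_select` and `evaluate`, and the values of `alpha` and `top_k`, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def two_stage_select(X, y, alpha=0.02, top_k=10, seed=0):
    """Return sorted column indices kept by a Lasso filter plus RF ranking.

    alpha and top_k are hypothetical settings, not values from the paper.
    """
    # Stage 1 (assumed form): fit Lasso on the 0/1 label and keep the
    # features whose coefficients were not shrunk to exactly zero.
    lasso = Lasso(alpha=alpha).fit(X, y)
    kept = np.flatnonzero(lasso.coef_ != 0)
    if kept.size == 0:  # the penalty removed everything: keep all features
        kept = np.arange(X.shape[1])
    # Stage 2 (assumed form): rank the surviving features by random forest
    # importance and keep at most top_k of them.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X[:, kept], y)
    order = np.argsort(rf.feature_importances_)[::-1][:top_k]
    return np.sort(kept[order])


def evaluate(model, X_tr, y_tr, X_te, y_te):
    """Fit the model and return the four metrics named in the abstract."""
    pred = model.fit(X_tr, y_tr).predict(X_te)
    return {
        "accuracy": accuracy_score(y_te, pred),
        "precision": precision_score(y_te, pred, zero_division=0),
        "recall": recall_score(y_te, pred, zero_division=0),
        "f1": f1_score(y_te, pred, zero_division=0),
    }


# Synthetic stand-in for a high-dimensional credit data set.
X, y = make_classification(n_samples=600, n_features=30, n_informative=6,
                           n_redundant=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardise with statistics from the training split only.
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# Feature selection sees only the training split.
selected = two_stage_select(X_tr, y_tr)

# Factories so that each run gets a fresh, unfitted classifier.
models = {
    "logistic_regression": lambda: LogisticRegression(max_iter=1000),
    "svm": lambda: SVC(),
    "random_forest": lambda: RandomForestClassifier(random_state=0),
    "decision_tree": lambda: DecisionTreeClassifier(random_state=0),
}

results = {}
for name, make_model in models.items():
    results[name] = {
        "all_features": evaluate(make_model(), X_tr, y_tr, X_te, y_te),
        "selected": evaluate(make_model(), X_tr[:, selected], y_tr,
                             X_te[:, selected], y_te),
    }

for name, res in results.items():
    print(name, res)
```

The selection step is fitted on the training split only, so the held-out metrics are not influenced by the test labels. Whether the selected subset actually improves a given classifier depends on the data, and the synthetic example makes no claim about reproducing the paper's results.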
Source
Price: Theory & Practice (《价格理论与实践》)
PKU Core Journal
2021, Issue 10, pp. 89-92, 194 (5 pages)
Funding
One of the research outputs of the Heilongjiang Province Philosophy and Social Science Fund Project (Project No. 20JYB031).
Keywords
credit evaluation
two-stage feature selection
personal loan
classification algorithm