
Random Forest Credit Evaluation Based on Combination Feature Selection (cited by: 6)
Abstract: In building a personal credit risk evaluation model, feature engineering largely determines the performance of the evaluator. Traditional feature selection methods cannot fully account for the impact of high-dimensional indicators on the evaluation results, and most studies set the size of the feature set manually when building the model, which leads to high randomness and low credibility. To address this, a random forest model (IV-XGBoostRF) is proposed in which XGBoost is optimized with a traditional risk-control indicator: the information value (IV) is combined with XGBoost to screen the original feature set and build a relatively complete credit evaluation model. Comparison experiments show that the accuracy of the improved random forest model increases by 0.90% and that all other evaluation metrics are better than those of traditional credit evaluation models, which demonstrates the feasibility of the combined feature selection method and indicates that it has some practical value.
Authors: RAO Shan-Shan (饶姗姗); LENG Xiao-Peng (冷小鹏) (School of Computer and Network Security (Oxford Brookes College), Chengdu University of Technology, Chengdu 610051, China)
Source: Computer Systems & Applications (《计算机系统应用》), 2022, No. 3, pp. 345-350 (6 pages)
Funding: Applied Basic Research Project of the Sichuan Provincial Department of Science and Technology (2021YJ0335).
Keywords: credit evaluation; information value; combination feature selection; random forest; XGBoost
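
The abstract names information value (IV) as the traditional risk-control indicator used to screen features, but it does not spell out how IV is computed or combined with XGBoost. As a rough illustration only, the sketch below computes the standard IV of a pre-binned feature against a binary default label and keeps the features whose IV clears a cutoff. The function names, the additive smoothing, and the 0.02 cutoff (a common rule of thumb) are assumptions for this example, not details taken from the paper; the XGBoost and random forest stages are deliberately left out.

```python
import math
from collections import defaultdict


def information_value(bins, labels, smoothing=0.5):
    """Information value of one pre-binned feature.

    bins   : bin identifier for each sample (any hashable value)
    labels : 1 for a defaulting ("bad") sample, 0 for a "good" one
    smoothing : assumed additive smoothing so that a bin with no good
                or no bad samples does not produce log(0) or division by zero
    """
    good = defaultdict(int)
    bad = defaultdict(int)
    for b, y in zip(bins, labels):
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1

    keys = set(good) | set(bad)
    total_good = sum(good.values()) + smoothing * len(keys)
    total_bad = sum(bad.values()) + smoothing * len(keys)

    iv = 0.0
    for k in keys:
        p_good = (good[k] + smoothing) / total_good  # share of goods in bin k
        p_bad = (bad[k] + smoothing) / total_bad     # share of bads in bin k
        # Each bin contributes (p_good - p_bad) * WoE, with WoE = ln(p_good / p_bad).
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


def select_by_iv(features, labels, threshold=0.02):
    """Keep features whose IV is at least `threshold` (hypothetical cutoff).

    features : dict mapping feature name -> list of bin identifiers
    Returns a dict of the surviving feature names and their IV scores.
    """
    scores = {name: information_value(col, labels) for name, col in features.items()}
    return {name: iv for name, iv in scores.items() if iv >= threshold}


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    features = {
        "uninformative": ["a", "b", "a", "b"],  # same bin mix in both classes
        "informative": ["a", "a", "b", "b"],    # bins separate the classes
    }
    print(select_by_iv(features, labels))
```

In a pipeline like the one the abstract describes, the features that survive this kind of IV filter would then be ranked or pruned further by a gradient-boosted model before the random forest is trained; how the paper sets those later thresholds is not stated here.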