
Research on Reject Inference in Credit Scoring Model: Based on the Improvement of Semi-Supervised Co-Training Method (cited by: 3)
Abstract: With the vigorous development of China's financial market, the problem of reject inference in credit evaluation has received growing attention. Credit scoring models face two difficulties: labeled (accepted) samples make up only a small share of the data, and the class distribution within the samples is imbalanced. To address them, this paper proposes a new algorithm, BCT (Bagging Co-Training with Optimized Threshold), built on semi-supervised learning and ensemble learning theory. The algorithm improves the traditional semi-supervised co-training method in three ways: it uses dynamic Bagging to generate multiple sub-classifiers, it introduces a classification threshold parameter to handle the imbalanced class distribution, and it sets an early stopping condition to avoid the risk of overfitting during iteration. An empirical analysis on five real data sets finds that, across data sets and rejection ratios, the BCT algorithm outperforms credit scoring models built on six other supervised and semi-supervised learning algorithms, which indicates good generalization performance and stronger model evaluation ability.
Authors: Li Chun (黎春); Zhou Zhenyu (周振宇)
Source: Statistical Research (《统计研究》), CSSCI, Peking University Core Journal, 2019, No. 9, pp. 82-92 (11 pages)
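Funding: Supported by the National Social Science Fund of China general project "Research on the Asymmetric Transmission Effects of Monetary Policy on Corporate Finance" (16BGL059); the National Social Science Fund of China major project "Statistical Monitoring and Evaluation of China's New Economy and New Growth Drivers in the Context of Big Data" (18DZA124); the National Social Science Fund of China major project "Compilation and Study of HDI Indices for China's Regions" (16ZDA010); the National Natural Science Foundation of China Young Scientists project "Theory, Models and Applications of Compiling a Financial Index for Chinese Listed Companies" (71102180); the Southwestern University of Finance and Economics project "Innovation Team for Real-Time Macroeconomic Monitoring in the New Era" (JBK190507); and the Southwestern University of Finance and Economics project "Innovation Team for Listed-Company Financial Indices and Macroeconomic Climate Forecasting" (JBK190506).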
Keywords: Reject Inference; Credit Scoring; Semi-Supervised Co-Training Method; BCT Algorithm
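The abstract names three ingredients (Bagging sub-classifiers that are regenerated as the labeled set grows, a classification threshold, and an early stopping condition) but gives no algorithmic detail. The following is only a minimal illustrative sketch of how such pieces could fit together in a pseudo-labeling loop; it is not the authors' BCT algorithm. The nearest-centroid base learner, the stratified bootstrap, the `confidence` cut-off, the validation-accuracy stopping rule, and every function and parameter name are assumptions made for illustration.

```python
import random

def fit_centroids(X, y):
    """Assumed base learner: mean feature vector per class (0 = good, 1 = bad)."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def votes_bad(cents, x):
    """Return 1 if x is closer to the 'bad' centroid than to the 'good' one."""
    dist = {k: sum((a - b) ** 2 for a, b in zip(x, c)) for k, c in cents.items()}
    return 1 if dist[1] < dist[0] else 0

def bagging_scores(X, y, X_query, n_members, rng):
    """Fraction of bootstrap-trained members voting 'bad' for each query point."""
    by_class = {c: [i for i, t in enumerate(y) if t == c] for c in (0, 1)}
    members = []
    for _ in range(n_members):
        # Stratified bootstrap (an assumption): resample within each class.
        idx = [rng.choice(ids) for ids in by_class.values() for _ in ids]
        members.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return [sum(votes_bad(m, x) for m in members) / n_members for x in X_query]

def bct_sketch(X_lab, y_lab, X_unl, X_val, y_val, threshold=0.5,
               confidence=0.9, n_members=11, max_iter=10, seed=0):
    """Grow the labeled set with confident pseudo-labels until validation drops."""
    rng = random.Random(seed)
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)

    def val_acc(X, y):
        # The classification threshold turns an ensemble score into a class.
        scores = bagging_scores(X, y, X_val, n_members, rng)
        hits = sum((s >= threshold) == bool(t) for s, t in zip(scores, y_val))
        return hits / len(y_val)

    best_acc = val_acc(X_lab, y_lab)
    for _ in range(max_iter):
        if not pool:
            break
        # The ensemble is rebuilt each round from the current labeled set.
        scores = bagging_scores(X_lab, y_lab, pool, n_members, rng)
        new_X, new_y, keep = [], [], []
        for x, s in zip(pool, scores):
            if s >= confidence:
                new_X.append(x)
                new_y.append(1)
            elif s <= 1 - confidence:
                new_X.append(x)
                new_y.append(0)
            else:
                keep.append(x)
        if not new_X:
            break
        acc = val_acc(X_lab + new_X, y_lab + new_y)
        if acc < best_acc:
            break  # early stop: discard a batch that hurts validation accuracy
        X_lab, y_lab, pool, best_acc = X_lab + new_X, y_lab + new_y, keep, acc
    return X_lab, y_lab, best_acc

# Toy data (hypothetical): accepted applicants carry labels, rejected ones do not.
X_acc = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y_acc = [0, 0, 0, 1, 1, 1]
X_rej = [[0.5, 0.5], [3.5, 3.5], [0.2, 0.8], [3.8, 3.2]]
X_out, y_out, acc = bct_sketch(X_acc, y_acc, X_rej, [[0, 0.5], [4, 4]], [0, 1])
```

In this sketch, lowering `threshold` below 0.5 would flag more applicants as bad, which is one common way to compensate for a scarce bad class; how the paper actually tunes its threshold is not stated in the abstract.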