Data Imbalance in Credit Score Model Based on Resampling Method (cited by: 8)
Abstract: Because customers who honor their obligations far outnumber those who default, credit data is severely imbalanced. Common methods for handling this imbalance rarely account for both default loss and market share, two factors that financial institutions care about. To address default loss, this paper proposes the Iterative Resampling Integration Model (IRIM), which uses iterative undersampling to increase the model's focus on "bad" customers and an ensemble method to turn weak classifiers into a strong one. To address market share, it refines the commonly used F-value index and introduces the RS index for evaluating classification performance. Simulation studies are carried out under six imbalance scenarios, and empirical studies are conducted on the SSBF dataset and on credit data from a bank in China. The results show that, compared with common methods and indices, IRIM reduces financial institutions' default risk without excessively reducing market share, and that the RS index appropriately balances market share against default risk.
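The abstract does not spell out the steps of IRIM or the formula for the RS index, so neither is reproduced here. As a rough illustration of the general idea only, the sketch below trains several weak logistic models on balanced undersamples and averages their scores, then computes the standard F-value that the RS index is said to build on. All function names, the gradient-descent base learner, and the parameter defaults are assumptions made for this example, not the authors' method.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Fit a plain logistic regression by batch gradient descent (assumed base learner)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w

def predict_proba(w, X):
    """Return the predicted probability of default for each row of X."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return 1.0 / (1.0 + np.exp(-(Xb @ w)))

def undersampling_ensemble(X, y, n_rounds=10, seed=0):
    """Train one weak model per balanced subsample; y == 1 marks a "bad" customer.

    Assumes the "bad" class is the minority, so sampling without replacement works.
    """
    rng = np.random.default_rng(seed)
    bad = np.flatnonzero(y == 1)
    good = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_rounds):
        # Keep every "bad" customer and draw an equally sized set of "good" ones.
        sampled_good = rng.choice(good, size=len(bad), replace=False)
        idx = np.concatenate([bad, sampled_good])
        models.append(fit_logistic(X[idx], y[idx]))
    return models

def ensemble_proba(models, X):
    """Average the default probabilities of all weak models."""
    return np.mean([predict_proba(w, X) for w in models], axis=0)

def f_value(y_true, y_pred, beta=1.0):
    """Standard F-beta score for the "bad" class (label 1)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
```

A typical use would be to call `undersampling_ensemble(X_train, y_train)`, threshold `ensemble_proba(models, X_test)` at a chosen cutoff, and pass the resulting labels to `f_value`. Lowering the cutoff rejects more applicants, which is the market-share side of the trade-off the abstract describes.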
Authors: Xia Liyu (夏利宇); He Xiaoqun (何晓群) (State Grid Energy Research Institute, Beijing 102209; School of Statistics, Renmin University of China, Beijing 100872)
Source: Management Review (《管理评论》; indexed in CSSCI and the Peking University Core Journals list), 2020, No. 3, pp. 75-84 (10 pages)
Funding: Major Project of the Key Research Bases of Humanities and Social Sciences, Ministry of Education (15JJD910002).
Keywords: credit score model; data imbalance; iterative resampling; evaluation index