Abstract
Customer credit scoring is an important part of daily operations for banks and other financial institutions. Defaulting customers usually make up a small minority of the customer population, while customers who repay on time make up the majority; this is the class imbalance problem commonly encountered in customer credit scoring. Existing credit scoring methods cannot effectively handle the class imbalance caused by absolute scarcity of minority-class samples. This study introduces transfer learning to integrate information from inside and outside the system in order to address this problem. To use minority-class samples from outside the system more efficiently, a new transfer learning model is constructed: it builds on transfer bagging, an ensemble-based transfer learning model, and uses two-stage sampling and the group method of data handling (GMDH) to improve the generation of its base models and its integration strategy, respectively. An empirical study on credit card customer data from a commercial bank in Chongqing shows that, compared with methods commonly used in customer credit scoring, the new model better handles the effect of class imbalance under absolute scarcity and, in particular, predicts the minority of defaulting customers more accurately.
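The general idea of bagging with externally sourced minority samples can be illustrated with a minimal sketch. The abstract does not specify the sampling rule, the base learner, or the details of the GMDH-based integration, so everything below is an assumption made for illustration: stage 1 bootstraps the in-system data, stage 2 tops up the minority class with external samples until the classes are balanced, the base model is a toy nearest-centroid classifier, and a plain majority vote stands in for the GMDH integration step. All function names are hypothetical.

```python
import random
from collections import Counter


def two_stage_sample(target, source_minority, rng, minority_label=1):
    """Assumed two-stage rule (not taken from the paper).

    Stage 1: bootstrap the in-system (target) data.
    Stage 2: add randomly drawn external minority samples until the
    minority class is as large as the majority class.
    Assumes source_minority is non-empty.
    """
    boot = [rng.choice(target) for _ in target]
    n_min = sum(1 for _, y in boot if y == minority_label)
    n_maj = len(boot) - n_min
    extra = [rng.choice(source_minority) for _ in range(max(0, n_maj - n_min))]
    return boot + extra


def fit_centroids(sample):
    """Toy base model: one mean feature vector (centroid) per class."""
    sums, counts = {}, Counter()
    for x, y in sample:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_one(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def transfer_bagging(target, source_minority, n_models=15, seed=0):
    """Train n_models base models, each on its own two-stage sample."""
    rng = random.Random(seed)
    return [
        fit_centroids(two_stage_sample(target, source_minority, rng))
        for _ in range(n_models)
    ]


def ensemble_predict(models, x):
    """Majority vote; a stand-in for the GMDH-based integration."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Each sample is a `(features, label)` pair, with label 1 assumed to mark a default. Using an odd `n_models` avoids tied votes in the two-class case.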
Source
《运筹与管理》
CSSCI
CSCD
PKU Core Journal (北大核心)
2015, Issue 2, pp. 201-207 (7 pages)
Operations Research and Management Science
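Funding
National Natural Science Foundation of China (71401115)
Humanities and Social Sciences Foundation of the Ministry of Education (13YJC630249)
Fundamental Research Funds for the Central Universities (2012SCU11013)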