Abstract
In credit rating modeling, categorical variables with many classes can substantially degrade both parameter estimation and prediction, so such variables need to be preprocessed. Drawing on methods for discretizing continuous data, this paper proposes supervised class-merging methods for multi-class categorical variables based on the Fisher exact test, the CACM criterion, and the ACACM criterion. In the third section, the proposed methods are applied to both simulated data and real credit data from small and micro enterprises to quantify and compare their effect on the credit rating model. The results show that effective class merging of multi-class categorical variables not only simplifies the model parameters but also improves the classification performance of the credit rating model.
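As an illustration only, the idea of merging classes under a Fisher exact test can be sketched as follows. This record does not give the paper's actual procedure, so the greedy pairwise strategy, the function names `fisher_exact_p` and `merge_categories`, the significance level `alpha`, and the toy counts are all assumptions made for this sketch. The CACM and ACACM criteria are not sketched because their definitions are not given here.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def merge_categories(counts, alpha=0.05):
    """Greedily merge categories whose (bad, good) counts do not differ significantly.

    counts maps a category name to a (bad, good) pair. This greedy rule is an
    assumption for illustration, not the procedure from the paper.
    """
    groups = {(name,): tuple(c) for name, c in counts.items()}
    while len(groups) > 1:
        keys = list(groups)
        best_p, best_pair = -1.0, None
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                (a, b), (c, d) = groups[keys[i]], groups[keys[j]]
                p = fisher_exact_p(a, b, c, d)
                if p > best_p:
                    best_p, best_pair = p, (keys[i], keys[j])
        if best_p <= alpha:
            break  # every remaining pair differs significantly
        g1, g2 = best_pair
        (a, b), (c, d) = groups.pop(g1), groups.pop(g2)
        groups[g1 + g2] = (a + c, b + d)
    return groups


# Hypothetical counts of (defaulted, non-defaulted) firms per category.
toy_counts = {"A": (5, 95), "B": (6, 94), "C": (40, 60), "D": (42, 58)}
merged = merge_categories(toy_counts)
```

In this toy input, categories with similar default rates are pooled first, and merging stops once every remaining pair differs at the chosen level, which reduces the number of dummy parameters a downstream credit rating model would need.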
Authors
刘赛可
何晓群
夏利宇
LIU Sai-ke; HE Xiao-qun; XIA Li-yu (Center for Applied Statistics, Renmin University of China, Beijing 100872, China; Management Consulting Institute, State Grid Energy Research Institute, Beijing 102209, China)
Source
《统计与信息论坛》
CSSCI
Peking University Core Journal (PKU Core)
2020, No. 7, pp. 3-8 (6 pages)
Journal of Statistics and Information
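Funding
Major Project of Key Research Bases for Humanities and Social Sciences, Ministry of Education: "Research on Statistical Models for Enterprise Credit Rating and Their Application and Evaluation" (15JJD910002)
National Social Science Fund of China project: "Statistical Modeling Research and Application of Personal Credit Rating" (13BTJ004).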
Keywords
multi-class categorical variable
supervised class merging
credit rating
data preprocessing