Study on Class Merging Method for Multi-class Categorical Variable in Credit Rating Modeling (信用评级中多类别分类自变量的类合并方法研究)

Cited by: 2
Abstract: In credit rating modeling, categorical variables with many classes can strongly affect both parameter estimation and prediction, so these variables need to be preprocessed. Drawing on methods for discretizing continuous data, this paper proposes supervised class merging methods for multi-class categorical variables based on the Fisher exact test, the CACM criterion, and the ACACM criterion. The proposed methods are evaluated on both simulated data and real credit data from small and micro enterprises. The results show that effective class merging of multi-class categorical variables not only simplifies the model's parameters but also improves the classification performance of the credit rating model.
Authors: LIU Sai-ke (刘赛可); HE Xiao-qun (何晓群); XIA Li-yu (夏利宇) (Center for Applied Statistics, Renmin University of China, Beijing 100872, China; Management Consulting Institute, State Grid Energy Research Institute, Beijing 102209, China)
Source: Journal of Statistics and Information (《统计与信息论坛》), indexed in CSSCI and the Peking University Core Journals list, 2020, No. 7, pp. 3-8 (6 pages)
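Funding: Major Project of the Key Research Bases of Humanities and Social Sciences, Ministry of Education, "Research and Applied Evaluation of Statistical Models for Enterprise Credit Rating" (15JJD910002); National Social Science Fund of China project "Statistical Modeling Research and Application of Personal Credit Rating" (13BTJ004).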
Keywords: multi-class categorical variable; supervised class merging; credit rating; data preprocessing
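
The abstract names the Fisher exact test as one basis for supervised class merging but does not spell out the procedure. The following is only a minimal sketch of how such a merge could work, not the authors' algorithm: it assumes a binary good/bad target, greedily merges the pair of classes whose good/bad proportions differ least significantly, and stops when every remaining pair differs at an assumed significance level `alpha = 0.05`. The function names `fisher_exact_p` and `merge_classes` and the toy counts are hypothetical.

```python
from itertools import combinations
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def merge_classes(counts, alpha=0.05):
    """Greedy bottom-up merging (assumed scheme, for illustration only).

    counts maps a class label to (n_good, n_bad). Returns a dict mapping
    tuples of merged labels to their pooled (n_good, n_bad) counts.
    """
    groups = {(name,): tuple(gb) for name, gb in counts.items()}
    while len(groups) > 1:
        # Find the pair of groups that is least distinguishable.
        best_p, best_pair = max(
            (fisher_exact_p(*groups[g], *groups[h]), (g, h))
            for g, h in combinations(groups, 2)
        )
        if best_p <= alpha:
            break  # every remaining pair differs significantly
        g, h = best_pair
        (a, b), (c, d) = groups.pop(g), groups.pop(h)
        groups[g + h] = (a + c, b + d)
    return groups


# Toy data: classes A and B have similar bad rates, C is clearly different.
merged = merge_classes({"A": (90, 10), "B": (88, 12), "C": (50, 50)})
print(merged)
```

On the toy counts, A and B are pooled into one class while C stays separate, so a model would need one dummy variable for this feature instead of two.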