Abstract
Conservative consumption habits in China keep default rates low in the data used to build credit rating models, so these data sets are imbalanced, and the imbalance degrades the predictive performance of the logistic regression model. This paper introduces the idea of an asymmetric link function into credit rating: the distribution function of the skew-logistic distribution is used as the inverse of the link function, and the skewness parameter and the regression coefficients are estimated from real data. The results show that skew-logistic regression predicts better than ordinary logistic regression and that, on the data set with a 10% default rate, it also outperforms the decision tree, the neural network, and the support vector machine.
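The estimation idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which parameterization of the skew-logistic distribution is used, so the code below assumes one common choice, the type I generalized logistic distribution function F(η; α) = (1 + e^(-η))^(-α), which reduces to the ordinary logistic model at α = 1. The function names, the synthetic data, and the simple finite-difference gradient descent are all illustrative assumptions, not the authors' procedure.

```python
import math
import random


def skew_logistic_cdf(eta, alpha):
    """Assumed skewed inverse link: (1 + exp(-eta)) ** (-alpha), alpha > 0."""
    # Numerically stable evaluation of log(1 + exp(-eta)).
    if eta >= 0:
        log_term = math.log1p(math.exp(-eta))
    else:
        log_term = -eta + math.log1p(math.exp(eta))
    return math.exp(-alpha * log_term)


def mean_nll(params, X, y):
    """Mean negative Bernoulli log-likelihood; params = [beta..., log_alpha]."""
    *beta, log_alpha = params
    # Clamp log_alpha so exp() cannot overflow during the line search.
    alpha = math.exp(max(-5.0, min(5.0, log_alpha)))
    total = 0.0
    for row, label in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))
        p = min(max(skew_logistic_cdf(eta, alpha), 1e-12), 1.0 - 1e-12)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(y)


def fit(X, y, steps=100, eps=1e-5):
    """Jointly estimate beta and the skewness parameter by gradient descent."""
    params = [0.0] * (len(X[0]) + 1)  # start at beta = 0, alpha = 1
    current = mean_nll(params, X, y)
    for _ in range(steps):
        # Forward-difference approximation of the gradient.
        grad = []
        for j in range(len(params)):
            bumped = params[:]
            bumped[j] += eps
            grad.append((mean_nll(bumped, X, y) - current) / eps)
        # Backtracking line search: accept only steps that lower the loss.
        step = 1.0
        while step > 1e-10:
            trial = [p - step * g for p, g in zip(params, grad)]
            value = mean_nll(trial, X, y)
            if value < current:
                params, current = trial, value
                break
            step /= 2
        else:
            break  # no improving step found
    return params, current


# Synthetic, imbalanced data (hypothetical; not the paper's data set).
random.seed(0)
true_beta, true_alpha = [-1.0, 1.0, -0.5], 2.0
X, y = [], []
for _ in range(400):
    row = [1.0, random.gauss(0, 1), random.gauss(0, 1)]
    eta = sum(b * v for b, v in zip(true_beta, row))
    X.append(row)
    y.append(1 if random.random() < skew_logistic_cdf(eta, true_alpha) else 0)

baseline_nll = mean_nll([0.0] * 4, X, y)
params, fitted_nll = fit(X, y)
print("default rate:", sum(y) / len(y))
print("estimated beta:", params[:-1])
print("estimated alpha:", math.exp(params[-1]))
```

Optimizing over log α rather than α keeps the skewness parameter positive without a constrained solver. In practice a quasi-Newton optimizer would likely replace the finite-difference descent used here for brevity.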
Source
《数理统计与管理》
CSSCI
Peking University Core Journal (北大核心)
2015, No. 6, pp. 1048-1056 (9 pages)
Journal of Applied Statistics and Management
Funding
Interim result of a National Social Science Fund of China project (Grant No. 13BTJ004)