Abstract
This paper first builds customer churn prediction models using logistic regression, linear discriminant analysis, k-nearest neighbors, support vector machines, Bayesian discriminant analysis, and decision trees, giving a quantitative analysis of whether a customer will churn. The six classifiers are then compared by ten-fold cross-validation, and the linear discriminant analysis model is selected because it has the highest accuracy. Its output is only a class label, churn or no churn, and does not give the probability that a customer will churn, so the output of the linear discriminant analysis model is further converted into probabilities, a step known as probability calibration. Finally, the models before and after calibration are evaluated with the Brier score and probability calibration curves, and the calibrated model gives better results.
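The evaluation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `brier_score` and `calibration_curve`, the bin count, and the four-customer toy data are all assumptions made for illustration. The Brier score is the mean squared difference between the predicted churn probability and the observed 0/1 outcome (lower is better), and a calibration curve compares the mean predicted probability in each bin with the observed churn rate in that bin.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed churn rate) per non-empty bin."""
    # Each bin accumulates: sum of probabilities, sum of outcomes, count.
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1
    return [(sp / n, sy / n) for sp, sy, n in bins if n > 0]


# Toy data (hypothetical, not from the paper): 1 = churned, 0 = retained.
y_true = [0, 0, 1, 1]
hard_labels = [0, 1, 1, 1]             # uncalibrated output: class labels only
calibrated = [0.1, 0.6, 0.7, 0.9]      # assumed probabilities after calibration

print("Brier (hard labels):", brier_score(y_true, hard_labels))
print("Brier (calibrated): ", brier_score(y_true, calibrated))
print("Calibration curve:  ", calibration_curve(y_true, calibrated, n_bins=2))
```

In this toy case the same customer is misclassified in both outputs, but the calibrated probability of 0.6 is penalized less than the hard label of 1, which is why a probability output can score better on the Brier score than class labels alone.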
Source
Statistics and Application (《统计学与应用》)
2021, No. 4, pp. 634-641 (8 pages)