Abstract
Customer credit evaluation based on logistic regression and classification trees is investigated. Several exploratory variables are selected from the many factors that influence customer creditworthiness, and a corresponding model for testing their association is built. The algorithms of logistic regression and classification trees are described, together with related concepts such as expected information and information gain. The logistic regression model and the classification tree are then each used to evaluate customer credit. The comparison shows that the classification tree model has a lower misclassification rate and greater flexibility, but it requires more computing resources and depends heavily on the observed data.
Source
《江苏科技大学学报(自然科学版)》
CAS
PKU Core Journal
2007, Issue B12, pp. 63-69 (7 pages)
Journal of Jiangsu University of Science and Technology: Natural Science Edition
Funding
Qing Lan Project of the Jiangsu Provincial Department of Education (2005DX028J)
Keywords
data mining
customer credit evaluation
logistic regression
classification tree