Abstract
With global economic growth, the number of credit card applicants has risen sharply, which makes assessing applicants' credit ratings especially important. We select the relatively important feature variables from large-scale credit card application data, cluster the applicants into several categories with the K-means method, and then build a multi-category ordinal logistic regression model. All results in this paper were produced in the statistical software SAS 9.3, and SAS macro programs were used to batch-process and analyze the credit ratings of bank credit card applicants on big data. The same approach can be extended to other big-data problems that involve similar rating and classification tasks.
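The two-stage idea described in the abstract (cluster applicants, then model the ordered categories) can be illustrated with a minimal sketch. This is not the authors' SAS macro code. It is a plain-Python illustration under several assumptions: the function names, the toy feature values, the quantile-based centroid initialization, the rule of ranking clusters by their first feature, and the coefficients passed to the ordinal model are all hypothetical. The ordinal part uses the cumulative-logit (proportional-odds) form P(Y <= j | x) = sigmoid(a_j - b·x); sign conventions differ between software packages.

```python
import math

def kmeans(points, k, iters=50):
    """Lloyd's algorithm on tuples of floats; returns (centroids, labels).

    Assumption: centroids are initialized deterministically from evenly
    spaced positions in the sorted data, purely to keep the sketch
    reproducible. Real software typically uses other seeding rules.
    """
    ordered = sorted(points)
    n = len(points)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels

def rank_clusters(centroids):
    """Map cluster id -> ordinal grade (0 = lowest).

    Assumption: clusters are ordered by the first feature of their centroid.
    """
    order = sorted(range(len(centroids)), key=lambda c: centroids[c][0])
    return {c: r for r, c in enumerate(order)}

def ordinal_probs(x, beta, cutpoints):
    """Category probabilities under a cumulative-logit model.

    cutpoints must be strictly increasing; with J cutpoints there are
    J + 1 ordered categories.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    cum = [1.0 / (1.0 + math.exp(-(a - eta))) for a in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical toy data: two standardized features per applicant.
applicants = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 9.0), (9.1, 8.8), (8.9, 9.2),
]
centroids, labels = kmeans(applicants, k=3)
grades = [rank_clusters(centroids)[lab] for lab in labels]
```

In a full analysis the grades obtained from clustering would serve as the ordered response when fitting the logistic model; here `ordinal_probs` only evaluates the model for given, made-up parameters, for example `ordinal_probs((0.5,), (1.0,), (-1.0, 1.0))`.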
Source
Journal of Jilin Normal University (Natural Science Edition) (《吉林师范大学学报(自然科学版)》)
2016, No. 3, pp. 72-81 (10 pages)
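Funding
National Natural Science Foundation of China, Young Scientists Fund (11301037)
National Natural Science Foundation of China, General Program (11571051)
Jilin Provincial Department of Education "13th Five-Year Plan" Project (2016317)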