Abstract
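[Objective] Using real-world Chinese online lending data, this study compares several popular ensemble methods to identify one suited to monitoring credit risk in Chinese online lending, and thereby to improve the efficiency of credit risk monitoring on Chinese online lending platforms. [Methods] Based on transaction data from Renrendai, features are extracted from five aspects of the borrower and screened with the Random Forest algorithm. Four ensemble algorithms and five base classifiers are then used to build credit risk early-warning models for comparison. [Results] The experiments show that Rotation Forest achieves the highest accuracy, 99.32%, with an error rate of only 1.71%. In addition, the Random Forest-based feature selection step improves the performance of the related models. [Limitations] The experimental dataset needs further expansion. [Conclusions] Combining the Rotation Forest ensemble model with the key factors for identifying risk can significantly improve the efficiency of credit risk prediction.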
[Objective] This paper compares several popular ensemble-learning methods on real-world data, aiming to find the most suitable one for monitoring P2P lending credit risk in China. [Methods] We extracted borrower features from five aspects and selected the most important ones with the Random Forest method. We then compared prediction models built from four ensemble-learning methods and five base classifiers. [Results] The Rotation Forest method had the highest accuracy, 99.32%, and the lowest error rate, 1.71%. Feature selection based on Random Forest could improve the performance of the related models. [Limitations] The sample dataset needs to be expanded. [Conclusions] The proposed method could identify credit risks more effectively.
Authors
操玮
李灿
贺婷婷
朱卫东
Cao Wei; Li Can; He Tingting; Zhu Weidong (School of Economics, Hefei University of Technology, Hefei 230601, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journals
2018, Issue 10, pp. 65-76 (12 pages)
Data Analysis and Knowledge Discovery
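Funding
One of the research outputs of the National Natural Science Foundation of China project "Research on Information Fusion for Science Fund Project Evaluation Based on Multi-Dimensional Evidence Theory" (Grant No. 71774047)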