Abstract
As the transaction volume of P2P online lending grows, the mining and analysis of P2P transaction data have attracted considerable attention. One important research topic is identifying the factors that affect the success rate of online loan applications. Existing studies mostly use linear regression, but they neglect multicollinearity among the variables and the problem of building the regression model on an optimal subset of variables. This paper applies Lasso regression to build a regression model on an optimal variable subset and uses it to analyze the factors that influence the loan success rate, which avoids the interference of multicollinearity and improves the model's fit to the data. An empirical analysis of lending data from the Lending Club platform shows that the proposed method outperforms the comparison method in both fitting accuracy and avoidance of multicollinearity.
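To illustrate why Lasso regression suits collinear predictors, the following minimal sketch fits a Lasso model by cyclic coordinate descent on a toy data set with two identical columns. It is not the authors' implementation. The function names, the penalty value `lam`, and the toy data are assumptions made for illustration, and the objective assumed here is (1/(2n))·||y − Xb||² + lam·||b||₁ with centered data and no intercept.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of targets. Columns of X and y are
    assumed to be centered, so no intercept is fitted (an assumption
    of this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual
            # (the residual with column j's own contribution added back).
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data (hypothetical): two perfectly collinear predictors, y = 2 * x.
X = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)  # [1.5, 0.0]
```

With these duplicated columns, ordinary least squares has no unique solution, whereas the L1 penalty keeps one predictor and sets the other coefficient exactly to zero. This is the variable-selection behavior the abstract relies on. The retained coefficient is shrunk from 2 to 1.5, which shows the bias that the penalty introduces.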
Source
Computer Systems & Applications (《计算机系统应用》)
2017, No. 7, pp. 204-209 (6 pages)
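Funding
National Natural Science Foundation of China (61672157)
Innovation Team on Key Theories and Technologies of Network and Information Security, Fujian Normal University (IRTL1207)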
Keywords
P2P network lending
Lasso regression
multicollinearity
loan success rate
loan purpose