Abstract
Loans to small and medium-sized enterprises (SMEs) are an essential part of national economic activity and play an important role in promoting technological innovation, driving economic growth, increasing tax revenue, creating employment, and improving people's livelihood. However, the credit evaluation standards of commercial banks are designed mainly for large enterprises and are a poor match for the financing needs of SMEs. To meet banks' credit standards, many SMEs therefore guarantee one another in order to obtain loans. As more and more enterprises take part, they form guarantee networks with complex structure. This is a double-edged sword for national financial security. On the one hand, guaranteed loans help enterprises raise funds rapidly and accelerate their development during periods of economic growth. On the other hand, although such networks can mitigate the risk of corporate defaults during economic downturns, within an intricate guarantee network the credit risk of an individual enterprise can more easily lead to systemic, industry-wide defaults. Existing research on SME credit risk still concentrates on evaluating individual enterprises and lacks a comprehensive assessment from the perspective of the guarantee network as a whole. In this paper, we therefore propose a risk assessment and prediction method for networked-guarantee loans in big-data settings. It comprises a probabilistic graphical model for credit risk diffusion and default prediction, a positive weighted k-nearest neighbor method for imbalanced samples, and a distributed algorithm framework for massive data, which we implement on Spark. Finally, we evaluate the method on a real-world financial guarantee network dataset. The results show that it clearly outperforms the compared baselines on the credit default prediction task, most notably when the diffusion order is near 4. Performance experiments on a big-data platform show that the distributed framework achieves a speedup of more than 5x while keeping near-linear scalability.
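The abstract names a positive weighted k-nearest neighbor method for imbalanced samples but does not give its weighting scheme. The following is only a minimal single-machine sketch of the general idea, assuming that each defaulting (positive) neighbor's vote is multiplied by a fixed factor. The function name `positive_weighted_knn`, the parameter `pos_weight`, and the toy data are illustrative assumptions, not the paper's algorithm or its Spark implementation.

```python
import math


def positive_weighted_knn(train, query, k=3, pos_weight=3.0):
    """Classify `query` with a k-NN vote in which positives count extra.

    train: list of (features, label) pairs; label 1 = default (rare class),
           label 0 = non-default.
    pos_weight: assumed constant multiplier for each positive neighbor's vote.
    """
    # Take the k training points closest to the query (Euclidean distance).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Positive neighbors vote with weight pos_weight, negatives with weight 1.
    pos_votes = sum(pos_weight for _, label in neighbors if label == 1)
    neg_votes = sum(1.0 for _, label in neighbors if label == 0)
    return 1 if pos_votes > neg_votes else 0


# Toy, made-up data: three non-default points and two default points.
toy_train = [
    ((0.0, 0.0), 0),
    ((0.1, 0.0), 0),
    ((0.0, 0.1), 0),
    ((1.0, 1.0), 1),
    ((0.2, 0.2), 1),
]
prediction = positive_weighted_knn(toy_train, (0.15, 0.15), k=3, pos_weight=3.0)
```

With `pos_weight=1.0` the function reduces to an ordinary majority vote, which tends to favor the majority class when positives are rare. Raising the weight lets a single nearby default outvote several non-default neighbors.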
Authors
CHENG Da-Wei (程大伟)
NIU Zhi-Bin (牛志彬)
ZHANG Li-Qing (张丽清)
Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240; College of Intelligence and Computing, Tianjin University, Tianjin 300354; Shanghai Research Center for Brain Science and Brain-inspired Intelligence, Shanghai 201308
Source
Chinese Journal of Computers (《计算机学报》)
Indexed in: EI, CSCD, PKU Core (北大核心)
2020, Issue 4, pp. 668-682 (15 pages)
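Funding
National Key Research and Development Program of China (No. 2018AAA0100704)
China Postdoctoral Science Foundation
Major Basic Research Project of the Shanghai Science and Technology Commission (No. 16JC1402800)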
Keywords
guarantee network (担保网络)
credit risk (信贷风险)
imbalanced classification (不均衡分类)
distributed algorithm (分布式算法)
risk diffusion (风险传播)