Abstract
Effectively identifying fraud tendencies in loan applications is a primary prerequisite for protecting the interests of both borrowers and lenders, and has long been a focus of the lending market. With advances in text mining, the information conveyed by the loan descriptions that applicants provide has attracted growing attention. This study uses loan description text to identify fraudulent behavior, which helps broaden the application of unstructured text data in everyday financial-market transactions. The deep learning model Transformer is used to extract information from the text, and an autoencoder then condenses that information further, yielding a text information measure. Benchmark machine learning models are built on 17 indicators, and the text information measure is then added as a new predictor variable. Out-of-sample prediction results show that the text information measure improves model fit, raising accuracy by 0.68% to 1.42% across different models, which indicates that the result is robust. Feature importance results also place the text information measure among the top 4 contributors to the models' predictions. These results confirm the role of text information in fraud identification.
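The second and third steps of the pipeline described above (condensing text vectors with an autoencoder, then adding the result as one extra predictor next to the 17 indicators) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not give the architecture, so the sketch uses a linear autoencoder with a single bottleneck unit, hand-written toy vectors in place of real Transformer outputs, and placeholder zeros in place of the 17 baseline indicators.

```python
# Illustrative sketch only: a 1-unit linear autoencoder in pure Python.
# The toy vectors stand in for Transformer text embeddings (assumption);
# the paper's actual architecture and data are not given in the abstract.

def encode(w_enc, x):
    """Project a vector onto the single bottleneck unit."""
    return sum(w * xi for w, xi in zip(w_enc, x))

def reconstruction_loss(w_enc, w_dec, vectors):
    """Mean squared reconstruction error over all vectors."""
    total = 0.0
    for x in vectors:
        code = encode(w_enc, x)
        total += sum((code * wd - xi) ** 2 for wd, xi in zip(w_dec, x))
    return total / len(vectors)

def train(w_enc, w_dec, vectors, epochs=300, lr=0.01):
    """Per-sample gradient descent on 0.5 * ||reconstruction - x||^2."""
    for _ in range(epochs):
        for x in vectors:
            code = encode(w_enc, x)
            err = [code * wd - xi for wd, xi in zip(w_dec, x)]
            # Gradient w.r.t. the code, computed before w_dec is updated.
            grad_code = sum(e * wd for e, wd in zip(err, w_dec))
            w_dec = [wd - lr * e * code for wd, e in zip(w_dec, err)]
            w_enc = [we - lr * grad_code * xi for we, xi in zip(w_enc, x)]
    return w_enc, w_dec

# Hypothetical 4-dimensional "embeddings", one per loan description.
embeddings = [
    [1.0, 2.0, -1.0, 0.5],
    [0.5, 1.1, -0.4, 0.2],
    [-1.0, -1.9, 1.1, -0.5],
    [-0.5, -1.0, 0.5, -0.3],
]

dim = len(embeddings[0])
w_enc, w_dec = [0.1] * dim, [0.1] * dim  # small deterministic start
loss_before = reconstruction_loss(w_enc, w_dec, embeddings)
w_enc, w_dec = train(w_enc, w_dec, embeddings)
loss_after = reconstruction_loss(w_enc, w_dec, embeddings)

# The bottleneck value plays the role of the text information measure.
text_measures = [encode(w_enc, x) for x in embeddings]

# Placeholder baseline rows: 17 indicators per applicant (values made up).
baseline = [[0.0] * 17 for _ in embeddings]
augmented = [row + [m] for row, m in zip(baseline, text_measures)]
```

Each row of `augmented` carries 18 features, so the same classifier can be fit once on `baseline` and once on `augmented` to compare out-of-sample accuracy, which is the comparison the abstract reports.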
Authors
刘娟娟
梁龙跃
蔡铉烨
LIU Juanjuan; LIANG Longyue; CAI Xuanye (School of Economics, Guizhou University, Guiyang 550025, China; School of Statistics and Mathematics, Central University of Finance and Economics, Beijing 102206, China)
Source
《智能计算机与应用》
2022, No. 7, pp. 52-58, 68 (8 pages in total)
Intelligent Computer and Applications
Funding
National Natural Science Foundation of China (52000045)
Guizhou University Graduate Innovative Talent Program (CJ202169).