Abstract
With the rapid development of the social economy, credit business plays an increasingly important role in the financial field, and credit evaluation based on machine learning algorithms has become the mainstream approach. However, several problems remain to be solved: delayed labels cause a shortage of labeled data and model lag, and dynamic credit evaluation models lack interpretability. To address these problems, this paper proposes an interpretable credit evaluation model for delayed-label scenarios. The model makes a weighted improvement to the dynamic model tree and combines a delayed label update algorithm with an adaptive-threshold pseudo-label selection strategy. Delayed-label data is treated as being in one of two states, feedback data or pseudo-label data, and each state is handled separately, which balances the effects of insufficient labeled data and model lag while keeping the model interpretable. Finally, experiments on several synthetic and real credit evaluation datasets show that, compared with other mainstream algorithms, the model achieves a better trade-off between predictive performance and interpretability.
Authors
XIN Bo (辛博)
DING Zhijun (丁志军)
Key Laboratory of Embedded System and Service Computing of Ministry of Education (Tongji University), Shanghai 201804, China; Shanghai Network Finance Security Collaborative Innovation Center (Tongji University), Shanghai 201804, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2024, No. 8, pp. 45-55 (11 pages)
Keywords
Credit evaluation
Delayed label
Interpretability
Dynamic model tree
Pseudo-label selection