Abstract
Insurance fraud not only jeopardizes the normal operation of insurance companies and increases the burden on policyholders, but may even affect national financial stability. With the advent of the big data era, insurance fraud detection urgently needs new, transformative techniques. Bagging ensemble methods are a good choice for insurers because their model structure can be adjusted to the amount of data, they are easy to deploy, their parameter space is controllable, and they support parallel computation. The family mainly comprises the Bagging, Random Subspace, and Random Patches algorithms, each of which can be combined with different base learners to form new branch algorithms and special cases. Using these algorithms, this paper empirically examines the insurance fraud problem, analyzing how well each algorithm suits the task, how well each pairs with different base learners, and how the number of base learners affects performance. The findings are as follows. For insurance fraud detection, Random Patches performs best among the three algorithms, while Bagging has the shortest running time. Different algorithms suit different base learners, but overall the decision tree is the base learner best suited to Bagging ensemble methods. The decision-tree-based methods all select whether a lawyer was retained as the most important feature. The effect of the number of base learners on performance is not consistent across the Bagging algorithms.
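The three algorithms named in the abstract differ mainly in what each base learner gets to see: a resample of the rows, a subset of the columns, or both. The following is a minimal sketch of that difference only, not the authors' implementation. The function name `draw_view`, the fraction parameters, and the choice to draw Random Patches rows without replacement are all assumptions made for illustration.

```python
import random


def draw_view(n_samples, n_features, method,
              sample_frac=0.5, feature_frac=0.5, rng=None):
    """Return (rows, cols): the indices one base learner would be trained on.

    Hypothetical helper for illustration; the names and default fractions
    are assumptions, not taken from the paper.
    """
    rng = rng or random.Random(0)
    all_rows = list(range(n_samples))
    all_cols = list(range(n_features))
    k_rows = max(1, int(sample_frac * n_samples))
    k_cols = max(1, int(feature_frac * n_features))

    if method == "bagging":
        # Bootstrap: draw n_samples rows with replacement, keep every feature.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        cols = all_cols
    elif method == "random_subspace":
        # Keep every row, draw a random subset of the features.
        rows = all_rows
        cols = sorted(rng.sample(all_cols, k_cols))
    elif method == "random_patches":
        # Draw random subsets of both rows and features
        # (rows drawn without replacement here, by assumption).
        rows = sorted(rng.sample(all_rows, k_rows))
        cols = sorted(rng.sample(all_cols, k_cols))
    else:
        raise ValueError(f"unknown method: {method!r}")
    return rows, cols


if __name__ == "__main__":
    for name in ("bagging", "random_subspace", "random_patches"):
        rows, cols = draw_view(10, 6, name, rng=random.Random(1))
        print(name, len(rows), "rows,", len(cols), "features")
```

In an ensemble, one such view would be drawn per base learner (for example, per decision tree), and the learners' predictions would then be combined, typically by majority vote.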
Authors
LI Xiu-fang (李秀芳)
HUANG Zhi-guo (黄志国)
CHEN Xiao-wei (陈孝伟)
Source
Insurance Studies (《保险研究》)
CSSCI
PKU Core Journals (北大核心)
2019, No. 4, pp. 66-84 (19 pages)
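Funding
Supported by:
National Natural Science Foundation of China, General Program, "Research on Economic Capital Forecasting and Optimal Allocation for Insurance Companies" (No. 71573143)
"Research on Supply Chain Risk Modeling and Optimization under an Uncertain Comprehensive Risk Analysis Framework" (No. 61673225)
Fundamental Research Funds for the Central Universities, "Interdisciplinary Research on Stochastic Optimal Control and Finance and Insurance Management" (No. 63185019)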