Abstract
Random forests are applied to predicting the purchase of commercial endowment insurance, using questionnaire data from the 2017 Chinese General Social Survey (CGSS). First, SMOTE oversampling is used to balance the data set. Next, grid search is used to select the model's input parameters. Finally, the improved random forest model is used for classification and compared with a support vector machine (SVM) model. The results show that SMOTE oversampling handles imbalanced data well and improves model performance, and that the random forest trained on the resampled data classifies better than the SVM.
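The oversampling step named in the abstract can be illustrated with a small sketch. This is a plain-Python illustration of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation. The function names `smote` and `balance`, the default `k=5`, and the binary 0/1 label encoding are assumptions made for the example.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes a binary problem: every label other than minority_label is
    treated as the majority class.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)
```

Calling `balance(X, y)` on a feature list `X` and label list `y` returns a data set in which purchasers and non-purchasers are equally represented. The remaining steps (grid search over random forest parameters and the SVM comparison) could be built on library routines such as scikit-learn's `GridSearchCV`, `RandomForestClassifier` and `SVC`. The abstract does not state which software the authors used.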
Authors
LI Qiang (李强)
CHEN Yanjiao (陈衍姣)
College of Big Data Applications and Economics, Guizhou University of Finance and Economics; Guizhou Province Big Data Statistical Analysis Key Laboratory, Guiyang 550025, China
Source
Science Technology and Industry (《科技和产业》)
2022, No. 8, pp. 271-275 (5 pages)
Funding
National Social Science Fund of China (18XTJ004).
Keywords
random forest
customer identification
commercial endowment insurance