Abstract
Data imbalance in financial distress prediction degrades model performance. To address this problem, the paper combines a hybrid sampling algorithm, particle swarm optimization (PSO), and random forest to build a new financial early-warning model. First, the SMOTE algorithm is improved using K-means and combined with random under-sampling to form a new hybrid sampling method. Then, this method is combined with a random forest optimized by PSO to build the early-warning model. Finally, new indices are introduced to evaluate the stability of the model, and the effect of different sample-size distributions on the classification results is explored. In an analysis of 2018-2019 data on companies listed on the SME Board, the old and new hybrid sampling methods were each paired with Bagging, SVM, a BP neural network, and random forest; the combination of the new hybrid sampling and random forest performed best on seven evaluation metrics, including AUC, with most metrics above 90%. After parameter optimization, the combined model improved by a further 8% on AUC, G-mean, and the new index SR.
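The hybrid sampling idea and the G-mean metric named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how K-means modifies SMOTE, so the sketch uses plain SMOTE-style interpolation between minority-class neighbours, followed by random under-sampling of the majority class. The PSO tuning of the random forest is not shown. The function names `smote_like_oversample`, `hybrid_resample`, and `g_mean`, the parameter `target`, and the toy data are all hypothetical. G-mean is computed with its standard definition, the geometric mean of sensitivity and specificity.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE idea,
    without the K-means step described in the paper)."""
    rng = rng or random.Random(0)
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def hybrid_resample(majority, minority, target, k=3, seed=0):
    """Bring both classes towards `target` samples each: over-sample the
    minority class and randomly under-sample the majority class."""
    rng = random.Random(seed)
    n_new = max(0, target - len(minority))
    new_minority = list(minority) + smote_like_oversample(minority, n_new, k, rng)
    new_majority = rng.sample(list(majority), min(target, len(majority)))
    return new_majority, new_minority


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on the positive class)
    and specificity (recall on the negative class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy usage: 10 "healthy" firms vs. 3 "distressed" firms, two features each.
healthy = [(float(i), float(i % 3)) for i in range(10)]
distressed = [(0.0, 10.0), (1.0, 11.0), (2.0, 12.0)]
balanced_healthy, balanced_distressed = hybrid_resample(healthy, distressed, target=6)
```

Meeting at a common `target` size, rather than only over-sampling or only under-sampling, is what makes the scheme hybrid: fewer synthetic minority points are needed and fewer majority samples are discarded. The choice of `target` here is an arbitrary illustration.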
Authors
郑列
鲍佳
ZHENG Lie; BAO Jia (School of Science, Hubei Univ. of Tech., Wuhan 430068, China)
Source
Journal of Hubei University of Technology (《湖北工业大学学报》)
2022, No. 2, pp. 110-115 (6 pages)