
Study on XGBoost Improved Method Based on Genetic Algorithm and Random Forest (cited by: 24)
Abstract Regression prediction is an important research direction in machine learning with a wide range of applications. To further improve the accuracy of regression prediction, an improved XGBoost method based on the genetic algorithm and random forest (GA_XGBoost_RF) is proposed. First, exploiting the good search ability and flexibility of the Genetic Algorithm (GA), the parameters of the XGBoost algorithm and the Random Forest (RF) algorithm are tuned with the mean cross-validation score as the objective function value, and the better parameter sets are selected to build the GA_XGBoost and GA_RF models, respectively. Then GA_XGBoost and GA_RF are combined with variable weights: the mean squared error between the predicted and true values on the training set serves as the objective function, and the genetic algorithm determines the model weights. Experiments on UCI datasets show that, compared with XGBoost, Random Forest, GA_XGBoost, and GA_RF, the GA_XGBoost_RF method achieves better mean squared error, absolute error, and goodness of fit than the single models on most datasets. In goodness of fit, the proposed method improves by about 0.01% to 2.1% across datasets, which indicates that it is an effective regression prediction method.
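The variable-weight combination step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract says only that a genetic algorithm chooses the model weights by minimising the training-set mean squared error. The GA operators used here (tournament selection, blend crossover, Gaussian mutation, elitism), the single weight `w` with `1 - w` on the second model, the function names `mse`, `combine`, and `ga_minimize`, and the toy prediction values are all hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def combine(w, pred_a, pred_b):
    """Weighted combination w * pred_a + (1 - w) * pred_b (assumed form)."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


def ga_minimize(objective, low, high, pop_size=30, generations=60,
                mutation_rate=0.2, seed=0):
    """Minimise a one-dimensional objective on [low, high] with a simple GA.

    The operators are illustrative choices, not taken from the paper.
    """
    rng = random.Random(seed)
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    best = min(pop, key=objective)

    def pick(population):
        # Tournament selection between two randomly drawn individuals.
        a, b = rng.sample(population, 2)
        return a if objective(a) < objective(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            alpha = rng.random()
            child = alpha * p1 + (1.0 - alpha) * p2  # blend crossover
            if rng.random() < mutation_rate:
                child += rng.gauss(0.0, 0.1 * (high - low))  # mutation
            children.append(min(max(child, low), high))  # clip to bounds
        pop = children
        best = min(pop, key=objective)
    return best


# Toy training-set values (made up for illustration only).
y_train = [3.0, 5.0, 7.0, 9.0, 11.0]
pred_xgb = [3.4, 5.5, 7.3, 9.6, 11.2]  # stand-in for GA_XGBoost predictions
pred_rf = [2.7, 4.6, 6.8, 8.5, 10.9]   # stand-in for GA_RF predictions


def fitness(w):
    """Training-set MSE of the combined prediction for weight w."""
    return mse(y_train, combine(w, pred_xgb, pred_rf))


w_best = ga_minimize(fitness, 0.0, 1.0)
print("weight on first model:", w_best)
print("combined MSE:", fitness(w_best))
print("single-model MSEs:", fitness(1.0), fitness(0.0))
```

In this toy setup one model overestimates and the other underestimates, so a weight strictly between 0 and 1 gives a lower training MSE than either model alone. The same search loop could in principle drive the parameter-tuning stage by encoding hyperparameters as the individual and using the mean cross-validation score as the fitness, but that extension is not shown here.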
Authors 王晓晖 (WANG Xiao-hui); 张亮 (ZHANG Liang); 李俊清 (LI Jun-qing); 孙玉翠 (SUN Yu-cui); 田捷 (TIAN Jie); 韩睿毅 (HAN Rui-yi) (School of Information Science and Engineering, Shandong Agricultural University, Taian, Shandong 271018, China; Agricultural Big Data Research Center, Shandong Agricultural University, Taian, Shandong 271018, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2020, Issue S02, pp. 454-458, 463 (6 pages in total)
Funding Research on Joint Flood Control Operation of River Basin Reservoir Groups Driven by Big Data (2019GSF111043).
Keywords Regression prediction; XGBoost; Combination prediction; Random forest; Genetic algorithm
