Random Forest Algorithm Based on Data Integration (cited by: 14)
Abstract Historical data used for sales forecasting is sparse and volatile, so traditional statistical and machine learning forecasting algorithms perform poorly when the forecast horizon is long. To address this, the paper draws on the ensemble idea of Random Forest (RF) and on random partitioning and recombination of the training data set to propose an RF algorithm based on data integration. The algorithm randomly recombines the original one-dimensional prediction variable into a high-dimensional variable and takes the sum of the outputs as the final prediction. Experimental results show that, compared with traditional algorithms such as ARIMA, RF, and GBDT, the algorithm significantly improves prediction performance on a real-world data set. Extended experiments further show that data integration can also be applied to ARIMA, improving its prediction accuracy by about 3%.
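The abstract gives only a high-level description, so the following is a minimal sketch of one possible reading of the split-and-recombine step, not the authors' implementation. It assumes the one-dimensional target is a total over a forecast window, that the window's periods are randomly partitioned into groups whose sums form the higher-dimensional target, and that one partition is shared across all training windows. The names `make_partition`, `regroup`, and `fit_mean_predictor` are hypothetical, the data is made up, and a column-mean placeholder stands in for the Random Forest regressor to keep the example dependency-free.

```python
import random


def make_partition(n_periods, n_groups, rng):
    """Randomly split period indices 0..n_periods-1 into n_groups disjoint groups."""
    indices = list(range(n_periods))
    rng.shuffle(indices)
    # Striding over the shuffled indices gives groups of near-equal size.
    return [indices[g::n_groups] for g in range(n_groups)]


def regroup(values, groups):
    """Turn one 1-D window into a vector of per-group sums."""
    return [sum(values[i] for i in group) for group in groups]


def fit_mean_predictor(targets):
    """Placeholder regressor (assumption): column-wise mean of the training targets."""
    n = len(targets)
    return [sum(column) / n for column in zip(*targets)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sparse sales windows (6 periods each); not data from the paper.
windows = [
    [0, 3, 0, 0, 5, 1],
    [2, 0, 0, 4, 0, 0],
    [0, 0, 7, 0, 1, 0],
]

groups = make_partition(n_periods=6, n_groups=3, rng=rng)
multi_targets = [regroup(w, groups) for w in windows]  # one 3-D target per window
component_pred = fit_mean_predictor(multi_targets)     # one output per component
final_pred = sum(component_pred)                       # summed outputs = final forecast

print(groups, multi_targets, final_pred)
```

Because the groups are disjoint and cover every period, each recombined target sums back to the window's original total, which is why summing the per-component outputs yields a forecast on the original scale. With a real regressor, each component would be predicted from features rather than from a column mean.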
Authors XIE Kun (谢坤); RONG Yutian (容钰添); HU Fengping (胡奉平); CHEN Huan (陈桓); YAO Xiaolong (姚小龙) (Research and Development Center of Big Data and Blockchain, SF Technology Co., Ltd., Shenzhen, Guangdong 518000, China)
Source Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2020, No. 12, pp. 290-298 (9 pages)
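Funding Shenzhen Development and Reform Commission special fund for the development of strategic emerging industries, project "Research, Development and Industrialization of an Intelligent Logistics System Based on Artificial Intelligence Technology".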
Keywords sales forecasting; time series prediction; machine learning; data integration; Random Forest (RF)