Abstract
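To address the nonlinear, noisy, non-stationary and volatile character of PM2.5 time series, a new XGBoost-ARIMA hybrid prediction model based on a decomposition-integration framework and correlation denoising is proposed. Taking Shijiazhuang as a case study, four influencing factors (including PM10 and SO2) are selected as inputs and PM2.5 as the target factor. The hybrid model is built to distinguish and handle the high-frequency and low-frequency data in the time series appropriately, and a Pearson correlation denoising method removes noise factors from the series. Case verification and comparison with classical prediction models show that the proposed XGBoost-ARIMA hybrid model is suitable for predicting the daily mean PM2.5 mass concentration needed for air pollution control and environmental policy-making. It predicts the daily mean mass concentration of air pollutants accurately and can provide scientific data support for pollution control and policy-making. Compared with classical prediction models it achieves better prediction performance (a mean absolute error of only 10.46518 and a Theil inequality coefficient as low as 0.08589).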
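The error measures quoted above can be sketched as follows. The abstract does not give the formulas, so this assumes the standard definitions of mean absolute error and root-mean-square error, and the bounded (U1) form of the Theil inequality coefficient. The function names are illustrative, not taken from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def theil_u(actual, predicted):
    """Theil inequality coefficient, assumed here to be the U1 form.

    U1 = RMSE / (sqrt(mean(actual^2)) + sqrt(mean(predicted^2))).
    It lies between 0 (perfect forecast) and 1.
    """
    n = len(actual)
    root_mean_sq_actual = math.sqrt(sum(a * a for a in actual) / n)
    root_mean_sq_pred = math.sqrt(sum(p * p for p in predicted) / n)
    return rmse(actual, predicted) / (root_mean_sq_actual + root_mean_sq_pred)
```

Because U1 is normalised by the magnitude of both series, it is unit-free, whereas the mean absolute error carries the units of the concentration data.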
Because PM2.5 time series are nonlinear, noisy, non-stationary and fluctuating, this paper proposes a new XGBoost-ARIMA hybrid prediction model based on a decomposition-integration framework and correlation denoising. In this model, four de-noised multivariate factors (e.g., PM10 and SO2) are selected as input features and PM2.5 as the target feature. The hybrid model distinguishes the high-frequency and low-frequency data in the time series and processes each part separately, which lets it exploit the strengths of two different predictors: the XGBoost model shows excellent prediction performance on the complex high-frequency part, while the linear ARIMA model is more accurate on the relatively stable low-frequency part. The main flow of the hybrid model is as follows. Firstly, the model decomposes the time series and the multivariate factor series using the CEEMDAN algorithm and obtains several modal components with different frequency characteristics. Secondly, the Pearson correlation denoising method is applied to the multivariate factor sequences to reduce the negative impact of noise factors on prediction performance. Thirdly, the fine-to-coarse algorithm separates the high-frequency and low-frequency parts of the time series; the XGBoost model predicts the high-frequency part and the ARIMA model predicts the low-frequency part. Finally, the predicted values of each part are summed to obtain the final prediction result. Example verification and comparison with classical prediction models show that the mean absolute error of the new XGBoost-ARIMA hybrid prediction model is only 10.46518, the root-mean-square error is only 15.52313, and the Theil inequality coefficient is as low as 0.08589. Measured by root-mean-square error, the prediction performance of the hybrid model improves by about 36.3% after correlation denoising, and compared with other classical prediction models it has lower prediction error and higher prediction accuracy. In conclusion, the hybrid model proposed in this paper accurately predicts the average daily concentration of air pollutants, providing scientific data support for pollution control and policy-making.
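The second and third steps of the flow described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the series have already been decomposed (the CEEMDAN step is not shown), the correlation threshold of 0.2 and the critical value of 1.96 are placeholder assumptions, and the function names are hypothetical. The fine-to-coarse step follows the commonly used form of that procedure, in which modal components are summed from the highest frequency down and a t-test finds where the mean of the partial sum first departs significantly from zero.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / (dev_x * dev_y)


def drop_noise_components(components, target, threshold=0.2):
    """Keep components whose |r| with the target reaches the threshold.

    `components` maps a name to a series; `threshold` is an assumed value.
    """
    return {
        name: series
        for name, series in components.items()
        if abs(pearson(series, target)) >= threshold
    }


def fine_to_coarse_split(imfs, t_crit=1.96):
    """Return k so that imfs[:k] is high-frequency and imfs[k:] low-frequency.

    `imfs` must be ordered from highest to lowest frequency. The partial sum
    of the first i+1 components is tested against a zero mean with a
    one-sample t statistic; `t_crit` is an assumed large-sample critical value.
    """
    n = len(imfs[0])
    partial = [0.0] * n
    for i, imf in enumerate(imfs):
        partial = [p + v for p, v in zip(partial, imf)]
        mean = sum(partial) / n
        var = sum((p - mean) ** 2 for p in partial) / (n - 1)
        if var == 0:
            if mean != 0:
                return i  # constant non-zero sum: clearly a trend term
            continue
        t_stat = mean / math.sqrt(var / n)
        if abs(t_stat) > t_crit:
            return i
    return len(imfs)  # no significant departure: treat all as high-frequency
```

In the flow described in the abstract, the components before the split point would be passed to the XGBoost model, those from the split point onward to the ARIMA model, and the two sets of predictions summed.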
Authors
刘拥民
罗皓懿
谢铁强
LIU Yong-min; LUO Hao-yi; XIE Tie-qiang (Computer and Information Engineering College, Central South University of Forestry and Technology, Changsha 410004, China; Smart Forestry Cloud Research Center, Central South University of Forestry and Technology, Changsha 410004, China)
Source
《安全与环境学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 1, pp. 211-221 (11 pages)
Journal of Safety and Environment
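Funding
National Natural Science Foundation of China (31870532)
Natural Science Foundation of Hunan Province (2021JJ31163)
Hunan Provincial Education Science "13th Five-Year Plan" Fund Project (XJK20BGD048)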