Prediction of gaseous nitrous acid based on Stacking ensemble learning model (cited by: 7)
Abstract: A prediction model for gaseous nitrous acid (HONO) based on Stacking ensemble learning was established. HONO concentrations in urban Beijing were measured with an incoherent broadband cavity enhanced absorption spectroscopy (IBBCEAS) system. Based on the sources of HONO, O3, CO, SO2, NO, NO2, NOy, temperature (T), relative humidity (RH), wind speed (WS), j(HONO), j(NO2), and j(O1D) were selected as features. Following an analysis of the mean diurnal variation of HONO, the measurement time was converted, by hour, into an additional feature. Base models were built with the Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Random Forest (RF) algorithms. The training set was partitioned by 5-fold cross-validation, and the outputs of the base models formed a new feature set that served as the input to a second-layer linear regression model. Training the models in both layers yielded the final Stacking ensemble HONO prediction model. Feature importance analysis, together with a calculation of the contribution of direct nighttime traffic emissions, showed that CO was an important factor in the model's predictions, indicating that direct vehicle emissions were an important source of HONO in this area in winter. The predictive performance of the single models and of the fused model was evaluated on the test set. The correlation coefficients between predicted and measured values were above 0.91 for all three single models. The Stacking model performed best, with a correlation coefficient of 0.94, a mean absolute error of 0.307×10⁻⁹, and a root mean square error of 0.453×10⁻⁹. These results demonstrate the interpretability and generalizability of the HONO prediction model based on Stacking ensemble learning.
Authors: TANG Ke (唐科); QIN Min (秦敏); ZHAO Xing (赵星); DUAN Jun (段俊); FANG Wu (方武); LIANG Shuai-xi (梁帅西); MENG Fan-hao (孟凡昊); YE Kai-di (叶凯迪); ZHANG He-lu (张鹤露); XIE Pin-hua (谢品华) (Key Laboratory of Environment Optics and Technology, Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China; Center for Excellence in Regional Atmospheric Environment, Chinese Academy of Sciences, Xiamen 361021, China)
Source: China Environmental Science (《中国环境科学》); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2020, Issue 2, pp. 582-590 (9 pages)
Funding: National Natural Science Foundation of China (41875154, 91544104, 4170050319); Key Deployment Project of the Chinese Academy of Sciences (KFZD-SW-320); National Key Research and Development Program of China (2017YFC0209403); Director's Fund of the Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences (AGHH201601).
Keywords: Stacking; K-fold cross validation; ensemble; gaseous HONO; prediction