Abstract
Objective: To compare the spatio-temporal variations and simulation performance of PM2.5 concentration datasets produced by different modeling methods commonly used in China and abroad. Methods: Nine national PM2.5 concentration simulation datasets that were published or shared by Chinese and international researchers from 2013 to 2020 were collected. The spatial and temporal distribution trends of the nine datasets were compared using statistical analysis and ArcGIS mapping. PyCharm was used to run a regression evaluation of the four datasets simulated by daily-value models. Results: The models differed in the level and range of simulated values in some local areas, but their overall spatial distributions were similar, with higher concentrations in central and eastern China and lower concentrations in the west. Except for the GBD dataset, PM2.5 concentrations in the other eight datasets showed an overall decreasing trend and the same seasonal pattern: highest in winter, intermediate in spring and autumn, and lowest in summer. Among the daily-value models, the random forest model performed best (R² = 0.76), with a relatively low root mean square error (RMSE, 21.96). Among the monthly-value models, the space-time extremely randomized trees model performed best (R² = 0.98), with a relatively low RMSE (3.26). Conclusion: The datasets simulated by the different models show similar spatio-temporal distributions of PM2.5 concentrations. Nonlinear machine learning models outperform atmospheric chemistry models and linear regression models. In the future, the strengths of nonlinear and ensemble machine learning models could be combined in an ensemble model to simulate PM2.5 concentrations, which may further improve the spatio-temporal resolution and simulation performance of the model.
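The regression evaluation mentioned in the abstract reports R² and RMSE for simulated versus observed concentrations. The following is a minimal Python sketch of how such metrics can be computed, not the authors' actual script. It assumes R² is taken as the squared Pearson correlation (which equals the R² of a simple linear regression between the two series); the function name `evaluate` and the sample values are hypothetical and are not data from the study.

```python
import math


def evaluate(observed, simulated):
    """Return (r2, rmse) for paired observed and simulated concentrations.

    r2 is the squared Pearson correlation, i.e. the R2 of a simple linear
    regression between the two series (an assumption about the metric).
    """
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_o == 0 or var_s == 0:
        raise ValueError("R2 is undefined for a constant series")
    r2 = cov ** 2 / (var_o * var_s)
    # RMSE: square root of the mean squared difference between the series
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)
    return r2, rmse


# Hypothetical daily values, for illustration only (not from the study)
observed = [35.0, 50.0, 80.0, 120.0, 60.0]
simulated = [40.0, 45.0, 85.0, 110.0, 65.0]
r2, rmse = evaluate(observed, simulated)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```

A higher R² together with a lower RMSE indicates closer agreement with the observations, which is the sense in which the abstract ranks the models.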
Authors
臧加伟
王情
高祥伟
许怀悦
ZANG Jia-wei; WANG Qing; GAO Xiang-wei; XU Huai-yue (School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China; National Institute of Environmental Health, Chinese Center for Disease Control and Prevention/China CDC Key Laboratory of Environment and Population Health)
Source
Journal of Environmental Hygiene (《环境卫生学杂志》)
2023, Issue 1, pp. 20-29 (10 pages)
Funding
National Natural Science Foundation of China, General Program (42071433).
Keywords
multiple models
fine particulate matter (PM2.5)
spatio-temporal comparison
simulation efficiency comparison