
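Establishment of an ARIMA model and its performance in predicting monthly reported pulmonary tuberculosis case numbers in China (cited by: 16)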

A study of prediction effect of autoregressive integrated moving average model on the monthly reported pulmonary tuberculosis cases in China
Abstract

Objective To establish an autoregressive integrated moving average (ARIMA) model and evaluate how well it predicts the monthly number of reported pulmonary tuberculosis cases in China (excluding Hong Kong, Macao and Taiwan; the same applies below), in order to provide a scientific reference for tuberculosis prevention and control measures.

Methods Monthly numbers of reported pulmonary tuberculosis cases nationwide from January 2006 to August 2019 were collected from the monthly summaries of class A, B and C notifiable infectious diseases published in the journal Disease Surveillance, which is sponsored by the Chinese Center for Disease Control and Prevention. Using SPSS 26.0, a time series was built from the data for January 2006 to December 2018, and the ARIMA model type and orders were preliminarily identified. Candidate models were then screened on the following criteria: model parsimony; statistical significance of every model parameter, including the autoregressive (AR), moving average (MA), seasonal autoregressive (SAR) and seasonal moving average (SMA) terms (all P<0.05); an overall model test (Ljung-Box Q) with P>0.05; the largest stationary R2; the smallest normalized Bayesian information criterion (NBIC); and the smallest root mean square error (RMSE). The reported case numbers for January to August 2019 were used as validation data, and the candidate with the smallest relative prediction error was taken as the optimal model. Finally, the optimal model was used to predict the monthly number of reported pulmonary tuberculosis cases in China from September 2019 to December 2020.

Results The time series built from the monthly reported case numbers for 2006 to 2018 indicated that an ARIMA(p,d,q) or ARIMA(p,d,q)×(P,D,Q) model should be fitted. Twelve basic models met the criteria of a Ljung-Box Q test with P>0.05, model parsimony, and statistically significant parameters (all P<0.05). From these, four candidate models were retained: the model with the largest R2 [ARIMA(1,0,1)(0,1,1)12, R2=0.707], the model with the smallest RMSE [ARIMA(0,1,2)(0,1,1)12, RMSE=9147.85], the model with the smallest NBIC [ARIMA(0,1,1)(0,1,1)12, NBIC=18.355], and the model with the smallest Ljung-Box Q [ARIMA(1,1,1)(0,1,1)12, Ljung-Box Q=8.797]. Each was used to predict the monthly reported case numbers for January to August 2019, and the predictions were compared with the actual reported numbers. The ARIMA(0,1,1)(0,1,1)12 model had the smallest mean relative prediction error (0.55%), with MA(1)=0.875 (t=19.243, P<0.001), SMA(1)=0.876 (t=7.596, P<0.001) and Ljung-Box Q=9.876 (df=16, P=0.873), and was chosen as the optimal model. This model was then used to predict the monthly reported case numbers from September 2019 to December 2020; for January to December 2020 it predicted 1025863 cases in total, an average of 85489 cases per month.

Conclusion The ARIMA(0,1,1)(0,1,1)12 model performed well in predicting the monthly number of reported pulmonary tuberculosis cases in China. However, model building and prediction form a dynamic process: the model needs to be adjusted continually as data accumulate in order to improve prediction accuracy.
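As an illustration of what the selected ARIMA(0,1,1)(0,1,1)12 structure implies, the sketch below applies its forecasting recursion in plain Python and scores the forecasts against a holdout period by mean relative error. It is not the authors' SPSS workflow. It assumes the Box-Jenkins sign convention for the MA terms, (1-B)(1-B^12)y_t = (1-θB)(1-ΘB^12)e_t, no constant term, residuals initialised to zero, and an absolute-value form of relative error; the abstract states none of these. The function names and the demonstration series are hypothetical, and only the coefficients 0.875 and 0.876 come from the abstract.

```python
from typing import List, Sequence


def airline_forecast(y: Sequence[float], theta: float, seasonal_theta: float,
                     steps: int, season: int = 12) -> List[float]:
    """Forecast an ARIMA(0,1,1)(0,1,1)_season series from given coefficients.

    Assumed form (Box-Jenkins signs, no constant):
    y_t = y_{t-1} + y_{t-s} - y_{t-s-1}
          + e_t - theta*e_{t-1} - Theta*e_{t-s} + theta*Theta*e_{t-s-1}
    """
    lag = season + 1
    if len(y) <= lag:
        raise ValueError("need more than season + 1 observations")
    # Assumption: residuals before the first computable point are zero.
    resid = [0.0] * len(y)

    def ma_part(t: int) -> float:
        return (-theta * resid[t - 1]
                - seasonal_theta * resid[t - season]
                + theta * seasonal_theta * resid[t - lag])

    # One-step-ahead pass over the observed data to recover residuals.
    for t in range(lag, len(y)):
        fitted = y[t - 1] + y[t - season] - y[t - lag] + ma_part(t)
        resid[t] = y[t] - fitted

    # Multi-step forecasts: future shocks are replaced by their mean, zero.
    path = list(y)
    for _ in range(steps):
        t = len(path)
        resid.append(0.0)
        path.append(path[t - 1] + path[t - season] - path[t - lag] + ma_part(t))
    return path[len(y):]


def mean_relative_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of |predicted - actual| / actual (absolute form is an assumption)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical data: a linear trend plus a fixed 12-month pattern.
# This is NOT the surveillance series used in the study.
pattern = [0, 5, 20, 15, 12, 10, 8, 6, 4, 3, 2, 1]
series = [1000 - 2 * t + 10 * pattern[t % 12] for t in range(56)]
train, holdout = series[:48], series[48:]  # last 8 months held out

forecast = airline_forecast(train, theta=0.875, seasonal_theta=0.876,
                            steps=len(holdout))
print(mean_relative_error(forecast, holdout))
```

The sketch only applies coefficients that are already known. Estimating them, and computing the Ljung-Box Q, stationary R2, NBIC and RMSE used for screening, would be left to statistical software such as the SPSS procedure named in the abstract.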
Authors ZHANG Shun-xian (张顺先); QIU Lei (邱磊); ZHANG Shao-yan (张少言); LI Cui (李翠); HU Jun (胡骏); TIAN Li-ming (田黎明); LU Zhen-hui (鹿振辉) (Respiratory Research Institute of Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200032, China)
Source Chinese Journal of Antituberculosis (《中国防痨杂志》), indexed in CAS and CSCD, 2020, Issue 6, pp. 614-620 (7 pages)
Funding National Science and Technology Major Project of the 13th Five-Year Plan (2018ZX10725-509).
Keywords Tuberculosis, pulmonary; Disease notification; Epidemiologic research design; Models, statistical; Forecasting