Abstract
Objective To compare stepwise regression with an artificial neural network for building a model that predicts emergency visits for cardiovascular and cerebrovascular diseases from meteorological and environmental factors. Methods The model input consisted of 23 variables: 12 routine daily meteorological observations for Beijing during 2008-2011, five secondary meteorological factors derived from them, the concentrations of three pollutants over the same period, and three dummy variables for year. The model output was built from four years of daily emergency visit records from a hospital in Beijing, from which the entries for cardiovascular and cerebrovascular diseases were extracted. All data were averaged by week and divided into a training set (80%), a test set (15%), and a validation set (5%). A stepwise regression model and an artificial neural network model were then constructed, and their predictive performance was compared on the independent samples (the validation set). Results Stepwise selection retained 11 input variables, while the artificial neural network had a 23-22-1 structure. On the validation set, the mean absolute error, mean error, minimum error, and Pearson correlation coefficient between predicted and observed values were 0.9149, -0.0033, 0.01, and 0.873 for the neural network model, all better than the corresponding values for the stepwise regression model (1.3553, 0.4924, 0.03, and 0.836). Conclusion Compared with the traditional statistical method, artificial neural network modeling has an advantage in building a forecasting system for cardiovascular and cerebrovascular disease visits based on meteorological and environmental factors.
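The data handling and the four validation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and three details are assumptions because the abstract does not state them: the split is done by random shuffling, the error is taken as predicted minus observed, and "minimum error" is read as the smallest absolute error.

```python
import math
import random


def weekly_means(daily, days_per_week=7):
    """Average consecutive 7-day blocks; a trailing partial week is dropped."""
    n_weeks = len(daily) // days_per_week
    return [
        sum(daily[i * days_per_week:(i + 1) * days_per_week]) / days_per_week
        for i in range(n_weeks)
    ]


def split_80_15_5(samples, seed=0):
    """Shuffle (assumption), then split into training 80%, test 15%, validation rest."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 80 // 100
    n_test = len(items) * 15 // 100
    return (
        items[:n_train],
        items[n_train:n_train + n_test],
        items[n_train + n_test:],
    )


def forecast_metrics(observed, predicted):
    """Return mean absolute error, mean error, minimum absolute error, Pearson r."""
    n = len(observed)
    # Sign convention (assumption): error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return {
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "mean_error": sum(errors) / n,
        "min_absolute_error": min(abs(e) for e in errors),
        "pearson_r": cov / math.sqrt(ss_obs * ss_pred),
    }
```

Calling `forecast_metrics` once per model on the same validation samples gives directly comparable figures of the kind reported in the Results.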
Source
《环境与健康杂志》
CAS
CSCD
PKU Core (Peking University Core Journals)
2013, Issue 1, pp. 48-52 (5 pages)
Journal of Environment and Health
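Funding
National Natural Science Foundation of China, Young Scientists Fund (81101130)
National Science and Technology Infrastructure Platform Project (2005DKA32403)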
Keywords
Back propagation (BP) neural network
Stepwise regression
Cardiovascular and cerebrovascular diseases
Meteorological factors
Environmental factors