Abstract
Objective: To compare, by simulation, how well an imputation method based on periodic information (the periodicity-based imputation method) and spline interpolation fill continuous runs of missing values in time-series data with clear periodicity. Methods: Continuous missingness was simulated in both simulated and actual time-series data, and the two methods were compared at different numbers of consecutive missing values. Imputation error was quantified with NRMSE and RMSE. Results: Except at continuous missing lengths of 10 and 20, the imputation error of the periodicity-based method was smaller than that of spline interpolation as the number of consecutive missing values increased. The error of the periodicity-based method showed no marked fluctuation across continuous missing lengths of 5 to 30 and stayed at a low level throughout, whereas the error of spline imputation rose markedly as the number of missing values increased. Conclusion: For time series with a definite periodicity, the periodicity-based method imputes continuous missing data better than spline interpolation; its imputation error is stable and does not change much as the continuous missing length increases.
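The comparison described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the periodicity-based algorithm, the spline variant, or the NRMSE normalisation, so the code assumes (a) a simple stand-in that fills each gap position with the mean of observed values at the same phase of the cycle, (b) a natural cubic spline, and (c) NRMSE defined as RMSE divided by the range of the true values. The function names, the simulated sine series, the period of 12, and the gap position are all hypothetical choices.

```python
import bisect
import math
import random


def rmse(truth, est):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, est)) / len(truth))


def nrmse(truth, est):
    """RMSE divided by the range of the true values (assumed normalisation)."""
    return rmse(truth, est) / (max(truth) - min(truth))


def periodic_fill(series, period):
    """Stand-in periodic imputation: mean of observed values at the same phase."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            peers = [series[j] for j in range(i % period, len(series), period)
                     if series[j] is not None]
            filled[i] = sum(peers) / len(peers)
    return filled


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the points (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    for i in range(1, n):
        alpha = 3 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        ell = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / ell
        z[i] = (alpha - h[i - 1] * z[i - 1]) / ell
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    # Back substitution; c[0] and c[n] stay zero (natural boundary conditions).
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        t = x - xs[j]
        return ys[j] + b[j] * t + c[j] * t ** 2 + d[j] * t ** 3

    return evaluate


def spline_fill(series):
    """Fill None entries by evaluating a spline fitted to the observed points."""
    xs = [i for i, v in enumerate(series) if v is not None]
    ys = [series[i] for i in xs]
    spline = natural_cubic_spline(xs, ys)
    return [spline(i) if v is None else v for i, v in enumerate(series)]


def run_demo(gap_lengths=(5, 10, 20, 30), period=12, n=240, start=100, seed=0):
    """Simulate a noisy periodic series, blank one gap, and score both fills."""
    rng = random.Random(seed)
    truth = [10 + 5 * math.sin(2 * math.pi * t / period) + rng.gauss(0, 0.5)
             for t in range(n)]
    results = {}
    for k in gap_lengths:
        gap = range(start, start + k)
        observed = [None if t in gap else y for t, y in enumerate(truth)]
        true_gap = [truth[t] for t in gap]
        scores = {}
        for name, filled in (("periodic", periodic_fill(observed, period)),
                             ("spline", spline_fill(observed))):
            est = [filled[t] for t in gap]
            scores[name] = (rmse(true_gap, est), nrmse(true_gap, est))
        results[k] = scores
    return results


if __name__ == "__main__":
    for k, scores in run_demo().items():
        print(k, scores)
```

Calling `run_demo()` returns, for each gap length, an `(RMSE, NRMSE)` pair per method. On a toy series like this one, the numbers only illustrate the mechanics of the comparison and should not be read as reproducing the results reported in the paper.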
Source
《中国卫生统计》
CSCD
Peking University Core Journal (北大核心)
2012, Issue 3, pp. 318-320, 324 (4 pages)
Chinese Journal of Health Statistics
Funding
Supported by the National Natural Science Foundation of China, 2008 (30872182)