Abstract
Objective To compare how different missing-value handling methods perform on simulated missing values in the national schistosomiasis surveillance data of China, and to provide a basis for choosing a method suited to these data. Methods From the data, 10%, 20%, 30%, 40%, and 50% of the observations were randomly selected as hypothesized missing values. These values were imputed by mean imputation, hot-deck imputation, and multiple imputation, and the results were evaluated on three aspects: distribution characteristics, accuracy, and precision. Results At every hypothesized missing proportion, the values imputed by each of the three methods did not differ significantly from the original values. Multiple imputation gave relatively good precision, and its imputed values agreed best with the distribution of the original values. Conclusion Multiple imputation is relatively well suited to handling missing values in these data when the proportion of missing values is small.
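The simulation design described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the data are synthetic gamma-distributed values standing in for the surveillance variable, the hot-deck step uses simple random donors, the multiple-imputation step uses an approximate Bayesian bootstrap with five imputations, and the printed summaries (standard deviation and mean) are assumed stand-ins for the paper's evaluation criteria, which the abstract does not specify. All function names are hypothetical.

```python
import numpy as np

def simulate_missing(n, proportion, rng):
    """Boolean mask flagging a random `proportion` of n observations as missing."""
    k = int(round(proportion * n))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=k, replace=False)] = True
    return mask

def mean_impute(values, mask):
    """Fill every masked value with the mean of the observed values."""
    filled = values.copy()
    filled[mask] = values[~mask].mean()
    return filled

def hot_deck_impute(values, mask, rng):
    """Fill each masked value with a randomly chosen observed (donor) value."""
    filled = values.copy()
    filled[mask] = rng.choice(values[~mask], size=int(mask.sum()), replace=True)
    return filled

def multiple_impute(values, mask, rng, m=5):
    """Return m completed datasets (approximate Bayesian bootstrap, an assumption)."""
    observed = values[~mask]
    completed = []
    for _ in range(m):
        # Resample the observed values first so that uncertainty about the
        # donor pool is carried into each imputed dataset.
        donors = rng.choice(observed, size=observed.size, replace=True)
        filled = values.copy()
        filled[mask] = rng.choice(donors, size=int(mask.sum()), replace=True)
        completed.append(filled)
    return completed

def pooled_mean(completed):
    """Pool the point estimate across imputations by averaging the m means."""
    return float(np.mean([d.mean() for d in completed]))

rng = np.random.default_rng(0)
# Synthetic stand-in for the real variable; not surveillance data.
original = rng.gamma(shape=2.0, scale=1.5, size=1000)

for p in (0.1, 0.2, 0.3, 0.4, 0.5):
    mask = simulate_missing(original.size, p, rng)
    mean_filled = mean_impute(original, mask)
    hot_filled = hot_deck_impute(original, mask, rng)
    mi_sets = multiple_impute(original, mask, rng)
    print(
        f"{p:.0%} missing | sd original {original.std(ddof=1):.3f} "
        f"| sd mean-fill {mean_filled.std(ddof=1):.3f} "
        f"| sd hot-deck {hot_filled.std(ddof=1):.3f} "
        f"| mean original {original.mean():.3f} "
        f"| MI pooled mean {pooled_mean(mi_sets):.3f}"
    )
```

Mean imputation replaces every missing value with the same number, so it shrinks the spread of the completed data as the missing proportion grows. That is one reason to evaluate distribution characteristics separately from accuracy.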
Source
Chinese Journal of Health Statistics (《中国卫生统计》)
CSCD
Peking University Core Journals (北大核心)
2010, Issue 2, pp. 125-128 (4 pages)
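Funding
Major Program of the National Natural Science Foundation of China (No. 30590374)
Fudan University Science and Technology Innovation Action Program, 19th round (复旦大学科创行动19期; No. C-19-02)
National Science and Technology Major Project (No. 2008ZX10004-011)
Fudan University Key Discipline Innovative Talent Training Program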
Keywords
Schistosomiasis
Disease surveillance
Missing value
Multiple imputation