Comparative Study of Various Imputation Methods in Dealing with Longitudinal Missing Data (多种填补方法在纵向缺失数据中的比较研究) Cited by: 17
Abstract Objective: To compare the performance of several methods for handling longitudinal missing data. Methods: Monte Carlo simulation in SAS was used to generate the most common type of longitudinal data, with one grouping factor and one repeated-measures factor. The complete data were analyzed with a mixed-effects model, and the result served as the reference standard. Six types of missing datasets were then constructed by combining two missing patterns, the arbitrary missing pattern (AMP) and the monotone missing pattern (MMP), with three missing mechanisms: missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR). The missing rate was set to 10%, 20%, 30%, 40%, and 50% in turn. The datasets were processed with the deletion method, single imputation, multiple imputation, and the EM algorithm, and the results were compared with the reference standard. Results: Under AMP with MCAR or MAR, all methods performed well at a low missing rate (≤10%); as the missing rate increased, only multiple imputation remained satisfactory. Under MMP with MCAR or MAR, only the linear regression method and the predictive mean matching method performed well. A drawback of multiple imputation is that it overestimates the variability of the regression coefficients to some extent. In addition, the choice of imputation method affected the results far more than the number of imputations did. Under NMAR, no method achieved satisfactory results. Conclusion: For longitudinal data with missing values, multiple imputation remains a relatively desirable method.
Source: Chinese Journal of Health Statistics (《中国卫生统计》), CSCD, PKU Core Journal (北大核心), 2016, Issue 1, pp. 45-48 (4 pages)
Keywords: longitudinal missing data; missing pattern; missing mechanism; multiple imputation
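
The abstract describes building missing datasets under three mechanisms and two patterns. The study itself used SAS and its code is not shown here, so the following Python sketch is only an illustration of how such datasets could be constructed. Every name and number in it (`simulate_longitudinal`, `missing_mask`, `to_monotone`, the group and time effects, the logistic weighting) is an assumption made for the example, not the authors' design.

```python
import numpy as np

def simulate_longitudinal(n_subjects=2000, n_times=4, seed=0):
    """Two-group repeated-measures data with a random subject intercept.
    All effect sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    group = rng.integers(0, 2, size=n_subjects)        # grouping factor
    times = np.arange(n_times)                         # repeated-measures factor
    mean = 10.0 + 1.0 * times[None, :] + 0.5 * group[:, None] * times[None, :]
    subject = rng.normal(0.0, 1.0, size=(n_subjects, 1))
    noise = rng.normal(0.0, 1.0, size=(n_subjects, n_times))
    return group, mean + subject + noise

def missing_mask(y, mechanism, rate, seed=1):
    """Boolean mask (True = missing) with an arbitrary pattern.
    The baseline column is always kept observed."""
    if not 0.0 < rate <= 0.5:
        raise ValueError("rate must be in (0, 0.5] so probabilities stay below 1")
    rng = np.random.default_rng(seed)
    n, t = y.shape
    if mechanism == "MCAR":
        # Missingness is unrelated to any data value.
        prob = np.full((n, t), rate)
    elif mechanism == "MAR":
        # Missingness depends only on the always-observed baseline value.
        z = (y[:, 0] - y[:, 0].mean()) / y[:, 0].std()
        prob = np.repeat((2.0 * rate / (1.0 + np.exp(-z)))[:, None], t, axis=1)
    elif mechanism == "NMAR":
        # Missingness depends on the value that would itself go missing.
        z = (y - y.mean(axis=0)) / y.std(axis=0)
        prob = 2.0 * rate / (1.0 + np.exp(-z))
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    mask = rng.random((n, t)) < prob
    mask[:, 0] = False
    return mask

def to_monotone(mask):
    """Turn an arbitrary pattern into dropout: once missing, always missing.
    Note that this raises the realized missing rate above the nominal one."""
    return np.maximum.accumulate(mask, axis=1)

group, y = simulate_longitudinal()
mask = missing_mask(y, "MAR", rate=0.3)
y_observed = np.where(mask, np.nan, y)   # dataset handed to an imputation method
```

Weighting the nominal rate by twice a logistic function of a standardized score keeps the average missing probability close to the nominal rate while letting individual probabilities vary with the data. A full replication would also need a calibrated monotone dropout model and the mixed-effects analysis, both of which are outside this sketch.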
