Abstract
Objective To compare the performance of several methods for handling missing data in longitudinal studies. Methods Longitudinal data of the most common type, with one grouping factor and one repeated-measures factor, were generated by Monte Carlo simulation in SAS and analyzed with a mixed-effects model; the results served as the reference standard. Six types of missing-data sets were then constructed: missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR), each under an arbitrary missing pattern (AMP) and a monotone missing pattern (MMP), with missing rates of 10%, 20%, 30%, 40%, and 50%. The data sets were handled with deletion, single imputation, multiple imputation, and the EM algorithm, and the results were compared with the reference standard. Results Under AMP with MCAR or MAR data, all methods performed well at a low missing rate (≤10%); as the missing rate increased, only multiple imputation gave satisfactory results. Under MMP with MCAR or MAR data, only the linear regression method and the predictive mean matching method performed well. A drawback of multiple imputation was that it overestimated the variability of the regression coefficients to some extent. In addition, the choice of imputation method affected the results far more than the number of imputations did. With NMAR data, no method achieved satisfactory results. Conclusion For longitudinal data with missing values, multiple imputation remains a relatively desirable method.
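The missing-data mechanisms and patterns named in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the SAS used in the study, and it is not the authors' simulation code: the function names (simulate, make_missing), the model coefficients, and the logistic form of the missingness probability are all assumptions chosen for illustration. In particular, rate here is a per-visit probability and is not calibrated to the overall missing rates reported above.

```python
import math
import random


def simulate(n_subjects=200, n_visits=4, seed=1):
    """Simulate a group-by-time design with a subject-level random intercept.

    All coefficients are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n_subjects):
        group = i % 2                    # one grouping factor with two levels
        b = rng.gauss(0.0, 1.0)          # subject-level random intercept
        row = [10.0 + 2.0 * group + 1.0 * t + b + rng.gauss(0.0, 1.0)
               for t in range(n_visits)]  # one repeated-measures factor (visit)
        data.append(row)
    return data


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_missing(data, mechanism, pattern, rate, seed=2):
    """Return a copy of data with None marking missing values.

    mechanism: "MCAR", "MAR" or "NMAR"; pattern: "AMP" or "MMP".
    The baseline visit (index 0) is always kept observed.
    """
    if not 0.0 <= rate <= 0.5:
        raise ValueError("rate must lie in [0, 0.5] so probabilities stay below 1")
    if pattern not in ("AMP", "MMP"):
        raise ValueError("pattern must be 'AMP' or 'MMP'")
    rng = random.Random(seed)
    n_visits = len(data[0])
    means = [sum(row[t] for row in data) / len(data) for t in range(n_visits)]
    out = []
    for row in data:
        new_row = list(row)
        dropped = False
        for t in range(1, n_visits):
            if mechanism == "MCAR":
                p = rate                                      # unrelated to any value
            elif mechanism == "MAR":
                p = 2.0 * rate * sigmoid(row[0] - means[0])   # depends on observed baseline
            elif mechanism == "NMAR":
                p = 2.0 * rate * sigmoid(row[t] - means[t])   # depends on the value itself
            else:
                raise ValueError("mechanism must be 'MCAR', 'MAR' or 'NMAR'")
            if dropped or rng.random() < p:
                new_row[t] = None
                # Monotone pattern: once a visit is missing, all later visits are too.
                dropped = (pattern == "MMP")
        out.append(new_row)
    return out


if __name__ == "__main__":
    full = simulate()
    holes = make_missing(full, "MAR", "MMP", rate=0.2)
    print(holes[:3])
```

Under MAR the probability is driven by a value that stays observed (the baseline), whereas under NMAR it is driven by the very value that may go missing, which is why no method that relies only on observed data can fully recover from it.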
Source
Chinese Journal of Health Statistics (《中国卫生统计》)
CSCD
PKU Core Journal (北大核心)
2016, Issue 1, pp. 45-48 (4 pages)
Keywords
Longitudinal missing data
Missing pattern
Missing mechanism
Multiple imputation