Abstract
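Objective To briefly introduce the use of MICE (Multivariate imputation by chained equations) for imputation in the R environment and to evaluate its imputation performance. Methods The imputation workflow is illustrated with real data, and MICE is compared with common missing-data methods (deletion, mean (mode) imputation, and regression imputation). Results At a missing rate of 10%, the estimates from MICE and from the common methods did not differ appreciably; for every method, the relative error of the regression coefficient estimates for the three types of variables was about 10%. As the missing rate increased (20%, 40%), the relative error of the coefficient estimates rose for all methods, but for MICE it stayed at roughly 10% to 20% for the three types of variables. MICE outperformed the other methods and gave stable results, regression imputation ranked second, and deletion and mean (mode) imputation performed worst. At a missing rate of 50%, the estimation error for all three types of variables was already large, and no method imputed well. Conclusion MICE is simpler to operate than other multiple imputation software. Compared with common missing-data methods, it makes fuller use of the information in incomplete records and reflects the true survey situation more accurately, so it merits wider use in practice.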
Objective To briefly introduce the basic R procedure of MICE (Multivariate imputation by chained equations) for imputing incomplete multivariate data, and to assess the imputation performance of MICE. Methods Using real data sets with missing values at different missing rates, we describe the R procedure of MICE and compare its imputation results with those of common methods: deletion, conditional mean (mode) imputation, and regression imputation. Results At a missing rate of 10% there were no obvious differences among the methods, and the relative error for the three kinds of variables was around 10% for all methods. As the missing rate increased to 20% and 40%, the relative error of the parameter estimates also increased, but for MICE the relative error for the three kinds of variables remained around 10% to 20%. MICE was superior to the other methods and performed stably, and regression imputation was preferable to deletion and mean (mode) imputation. When the missing rate exceeded 50%, none of the methods was appropriate. Conclusion MICE is more attractive than other multiple imputation software because it is simple and easy to use. Compared with common methods, MICE performs better at higher missing rates and is worth using widely for incomplete multivariate data.
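The idea behind the abstract can be illustrated with a minimal sketch: fill the gaps by chained regressions, analyse several completed data sets, pool the estimates, and compute a relative error. This is a sketch under stated assumptions, not the authors' R procedure. It is written in Python with the standard library only, uses simulated data rather than the paper's data, handles just two variables, omits the parameter-uncertainty draw that a full MICE implementation performs, and assumes relative error means |estimate - reference| / |reference| with the complete-data slope as reference. All function names are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for ys ~ xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sd

def chained_imputation(cols, n_iter, rng):
    """Return one completed copy of two columns (None marks a missing value)."""
    miss = [[v is None for v in col] for col in cols]
    # Start by filling each gap with the observed mean of its column.
    filled = []
    for col in cols:
        mean = statistics.fmean(v for v in col if v is not None)
        filled.append([mean if v is None else v for v in col])
    for _ in range(n_iter):
        for target in (0, 1):
            other = 1 - target
            # Fit target ~ other on the rows where the target was observed.
            rows = [i for i, m in enumerate(miss[target]) if not m]
            a, b, sd = fit_line([filled[other][i] for i in rows],
                                [filled[target][i] for i in rows])
            # Redraw each missing target value as prediction plus noise.
            for i, m in enumerate(miss[target]):
                if m:
                    filled[target][i] = a + b * filled[other][i] + rng.gauss(0, sd)
    return filled

def simulate(n, rate, rng):
    """Simulated data (an assumption): y = 1 + 2x + noise, gaps at random."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
    x_miss, y_miss = list(x), list(y)
    for i in rng.sample(range(n), int(rate * n)):
        # Each selected row loses one of its two values.
        if rng.random() < 0.5:
            x_miss[i] = None
        else:
            y_miss[i] = None
    return x, y, x_miss, y_miss

rng = random.Random(2011)
x, y, x_miss, y_miss = simulate(n=500, rate=0.2, rng=rng)
reference = fit_line(x, y)[1]          # slope from the complete data

slopes = []
for _ in range(5):                     # m = 5 completed data sets
    cx, cy = chained_imputation([x_miss, y_miss], n_iter=10, rng=rng)
    slopes.append(fit_line(cx, cy)[1])

pooled = statistics.fmean(slopes)      # pooled point estimate: mean over imputations
rel_err = abs(pooled - reference) / abs(reference)
print(f"pooled slope {pooled:.3f}, relative error {rel_err:.1%}")
```

Changing `rate` to 0.1, 0.4 or 0.5 mirrors the missing rates examined in the abstract, although this toy setting is not expected to reproduce the reported figures or the ranking of methods.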
Source
Chinese Journal of Hospital Statistics (《中国医院统计》)
2011, No. 4, pp. 309-312 (4 pages)
Keywords
MICE
Multiple imputation
Missing data
Multivariable analysis