Abstract
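Objective To study how MICE (multivariate imputation by chained equations) multiple imputation performs under different missing rates and missing mechanisms, and to explore the situations in which this imputation method is applicable. Methods Based on the complete data of a cross-sectional survey, R software was used to construct missing data with different missing rates and missing mechanisms. The standard bias of the analysis results after listwise deletion and after MICE multiple imputation was calculated and compared. For categorical variables, the average misclassification rate after multiple imputation was calculated separately. Results In the three missing-at-random scenarios with univariate missing rates of 10%, 20%, and 30%, MICE multiple imputation performed well; in the other simulated scenarios, MICE multiple imputation showed no clear advantage over listwise deletion. For categorical variables, the average misclassification rate after MICE imputation exceeded 60% in all cases. Conclusion For data that are missing at random with a univariate missing rate of no more than 30%, MICE multiple imputation is recommended; for categorical variables in the data, however, directly citing the specific values imputed by MICE is not recommended.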
Objective To evaluate the performance of multivariate imputation by chained equations (MICE) on data with different missing mechanisms and various missing proportions, and to explore the situations in which this method is applicable. Methods A complete dataset from a cross-sectional study was used to simulate missing datasets with different missing mechanisms and various missing proportions in R software. The standard bias of the incomplete datasets obtained by listwise deletion was compared with that of the imputed datasets obtained by MICE. Additionally, for binomial variables, the average misclassification ratio was calculated. Results MICE performed well for "missing at random" data with univariate missing proportions of 10%, 20%, and 30%. In the other scenarios, MICE showed no advantage over listwise deletion. For binomial variables, the average misclassification ratios were more than 60%. Conclusion When the data are missing at random and the univariate missing proportion is no more than 30%, MICE is recommended, but the imputed values of binomial variables should not be reported directly as raw data.
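The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R code, and it does not implement MICE. It only shows one way to construct missing values under two mechanisms and to compute the two evaluation measures. Everything here is an assumption made for illustration: the function names `make_missing`, `standardized_bias`, and `misclassification_rate`, the logistic form of the missing-at-random mechanism, and the definition of standardized bias as 100 × |mean estimate − true value| / SD of the estimates (the abstract does not define "standard bias").

```python
import math
import random
import statistics


def make_missing(rows, rate, mechanism, rng):
    """Return a copy of rows with some y values replaced by None.

    rows: list of (x, y) pairs; x is always kept observed.
    mechanism: "MCAR" (missingness ignores the data) or
               "MAR" (missingness depends only on the observed x).
    The MAR form below is an illustrative assumption, not the paper's.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if mechanism == "MCAR":
        probs = [rate] * len(rows)
    elif mechanism == "MAR":
        # Logistic weights in x, rescaled so the expected share of
        # missing y values is approximately `rate`.
        weights = [1.0 / (1.0 + math.exp(-x)) for x, _ in rows]
        scale = rate * len(rows) / sum(weights)
        probs = [min(1.0, scale * w) for w in weights]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return [(x, None if rng.random() < p else y)
            for (x, y), p in zip(rows, probs)]


def standardized_bias(estimates, truth):
    """Assumed definition: 100 * |mean(estimates) - truth| / SD(estimates)."""
    return 100.0 * abs(statistics.mean(estimates) - truth) / statistics.stdev(estimates)


def misclassification_rate(true_vals, imputed_vals, was_missing):
    """Share of originally missing cells whose imputed category is wrong."""
    idx = [i for i, missing in enumerate(was_missing) if missing]
    if not idx:
        return 0.0
    return sum(true_vals[i] != imputed_vals[i] for i in idx) / len(idx)


if __name__ == "__main__":
    rng = random.Random(2015)
    xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    rows = [(x, "high" if x + rng.gauss(0.0, 1.0) > 0 else "low") for x in xs]
    holed = make_missing(rows, rate=0.2, mechanism="MAR", rng=rng)
    was_missing = [y is None for _, y in holed]
    # Placeholder imputation (NOT MICE): fill every gap with the observed mode.
    mode = statistics.mode([y for _, y in holed if y is not None])
    imputed = [mode if y is None else y for _, y in holed]
    truth = [y for _, y in rows]
    print("observed missing rate:", sum(was_missing) / len(rows))
    print("misclassification rate:",
          misclassification_rate(truth, imputed, was_missing))
```

In a full replication, the placeholder mode imputation would be replaced by an actual chained-equations imputer, and `standardized_bias` would be applied to the estimates collected over many simulated datasets for both listwise deletion and imputation.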
Source
《中国卫生统计》
CSCD
PKU Core Journal (北大核心)
2015, Issue 4, pp. 580-584 (5 pages)
Chinese Journal of Health Statistics
Funding
Shandong Province Science and Technology Development Plan (No. 2014GGH218019)