Comparison of Imputation Methods Under Different Missing-Data Rates
Abstract Missing data cannot be ignored: a high proportion of missing values can seriously distort experimental results, so handling missing data properly is essential. To address this, a built-in R data set is used, and listwise deletion, single imputation, and multiple imputation are each applied with 10%, 25%, 50%, and 75% of the data missing; statistical tests are then run on the resulting data. The results show that when the missing proportion is small, the three methods give similar results, and the simpler listwise deletion is an adequate choice. As the missing proportion increases, the results of the methods diverge more and more; multiple imputation is the most stable, and its results stay closest to those obtained from the true data.
Authors XI Mengyao (席梦瑶); LAI Junfeng (赖俊峰); ZHANG Gaimei (张改梅) (School of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China; Hohhot No.1 Hospital, Hohhot 010030, China)
Source Journal of Inner Mongolia University of Technology: Natural Science Edition (《内蒙古工业大学学报(自然科学版)》), 2023, No. 5, pp. 391-395 (5 pages)
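Funding Inner Mongolia Autonomous Region Education Planning Project (NGJGH2021094); Inner Mongolia University of Technology Doctoral Fund Project (BS201930); Basic Scientific Research Funds Project for Universities Directly Under the Inner Mongolia Autonomous Region (JY20220190).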
Keywords listwise deletion; single imputation; multiple imputation; R
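
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the paper works in R on a built-in data set, whereas this example uses Python's standard library and synthetic data. It also assumes values are missing completely at random, that single imputation means mean imputation, and that multiple imputation is done by bootstrap stochastic regression imputation. All function names (`make_data`, `mask_mcar`, `fit_line`, `mean_imputed`, `multiple_imputation`, `compare`) are hypothetical.

```python
import random
from statistics import fmean, stdev


def make_data(n, rng):
    """Synthetic stand-in for a built-in data set: y depends linearly on x."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def mask_mcar(ys, rate, rng):
    """Set a fixed fraction of y values to None, completely at random (assumed)."""
    missing = set(rng.sample(range(len(ys)), round(rate * len(ys))))
    return [None if i in missing else y for i, y in enumerate(ys)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual sd)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (len(xs) - 2)) ** 0.5


def mean_imputed(ys_obs):
    """Single imputation (assumed variant): fill gaps with the observed mean."""
    observed_mean = fmean(y for y in ys_obs if y is not None)
    return [observed_mean if y is None else y for y in ys_obs]


def multiple_imputation(xs, ys_obs, m, rng):
    """Return m completed copies of y via bootstrap stochastic regression."""
    complete = [(x, y) for x, y in zip(xs, ys_obs) if y is not None]
    copies = []
    for _ in range(m):
        # Refit on a bootstrap resample so each copy reflects parameter uncertainty.
        boot = rng.choices(complete, k=len(complete))
        a, b, sd = fit_line([p[0] for p in boot], [p[1] for p in boot])
        copies.append([
            a + b * x + rng.gauss(0.0, sd) if y is None else y
            for x, y in zip(xs, ys_obs)
        ])
    return copies


def compare(rate, n=200, m=20, seed=1):
    """Mean and standard deviation of y under each method at one missing rate."""
    rng = random.Random(seed)
    xs, ys = make_data(n, rng)
    ys_obs = mask_mcar(ys, rate, rng)
    observed = [y for y in ys_obs if y is not None]  # listwise deletion
    single = mean_imputed(ys_obs)
    copies = multiple_imputation(xs, ys_obs, m, rng)
    return {
        "true": (fmean(ys), stdev(ys)),
        "listwise": (fmean(observed), stdev(observed)),
        "single": (fmean(single), stdev(single)),
        # Pooled point estimates: average of the per-copy estimates.
        "multiple": (fmean(fmean(c) for c in copies),
                     fmean(stdev(c) for c in copies)),
    }


if __name__ == "__main__":
    for rate in (0.10, 0.25, 0.50, 0.75):
        summary = compare(rate)
        print(rate, {k: (round(mu, 3), round(sd, 3))
                     for k, (mu, sd) in summary.items()})
```

In this sketch, mean imputation reproduces the listwise mean exactly but shrinks the standard deviation, and the shrinkage grows with the missing rate. That is one reason a single-imputation result can drift away from the true data as more values are missing.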