Abstract
Missing data are common in randomized block designs. Four methods are currently used to handle them: deleting the missing observations, mean imputation, formula imputation, and Yates' imputation. The relative merits of these methods are worth studying, and this article compares them by simulation. First, a 4×5 randomized block design is generated at random, and the number of missing values m is set to 1, …, 6. Second, for each m, all possible combinations of missing-value positions are enumerated; for each combination, the four methods are applied separately, and the standard error, R², and adjusted R² of the linear regression are recorded for each method. The simulation results show that Yates' imputation performs best of the four methods, and a real-data example supports this conclusion.
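To make the enumeration and the Yates-style estimate concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes rows are treatments and columns are blocks, uses the classical single-cell formula x = (tT + bB − G) / ((t − 1)(b − 1)), and handles several missing cells by repeating that formula until the estimates settle. The function names `yates_impute` and `missing_patterns` are hypothetical.

```python
from itertools import combinations


def yates_impute(table, max_iter=200, tol=1e-10):
    """Fill None cells of a treatments x blocks table (illustrative sketch).

    For one missing cell the estimate is
        x = (t*T + b*B - G) / ((t - 1) * (b - 1)),
    where T and B are the totals of the affected row and column and G is
    the grand total, each computed without the cell itself. With several
    missing cells the formula is applied cell by cell until the estimates
    stop changing (an assumed iterative scheme).
    """
    t, b = len(table), len(table[0])
    missing = [(i, j) for i in range(t) for j in range(b) if table[i][j] is None]
    observed = [v for row in table for v in row if v is not None]
    start = sum(observed) / len(observed)  # initial guess: mean of observed values
    filled = [[start if v is None else float(v) for v in row] for row in table]
    for _ in range(max_iter):
        largest_change = 0.0
        for i, j in missing:
            current = filled[i][j]
            row_total = sum(filled[i]) - current
            col_total = sum(filled[r][j] for r in range(t)) - current
            grand_total = sum(map(sum, filled)) - current
            new = (t * row_total + b * col_total - grand_total) / ((t - 1) * (b - 1))
            largest_change = max(largest_change, abs(new - current))
            filled[i][j] = new
        if largest_change < tol:
            break
    return filled


def missing_patterns(t, b, m):
    """Yield every way to choose m missing cells in a t x b layout."""
    cells = [(i, j) for i in range(t) for j in range(b)]
    return combinations(cells, m)
```

A simulation loop in this spirit would call `missing_patterns(4, 5, m)` for each m, blank out the chosen cells, impute, and fit the regression. Patterns that remove an entire row or column leave the corresponding effect unidentified, so a fuller implementation would need to treat those cases separately.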
Funding
Supported by the 2012 Dali University Young Teachers' Scientific Research Fund (KYQN201219)