In analyzing data from clinical trials and longitudinal studies, missing values are a fundamental challenge because they can introduce bias and lead to erroneous statistical inferences. To deal with this challenge, several imputation methods have been developed in the literature; the most commonly used are the complete case method, mean imputation, last observation carried forward (LOCF), and multiple imputation (MI). In this paper, we conduct a simulation study to investigate the efficiency of these four methods for longitudinal data under missing completely at random (MCAR). We consider three levels of missingness: 5%, 30%, and 50%. The simulation study shows that LOCF has more bias than the other three methods in most situations, while MI has the least bias and the best coverage probability. Thus, we conclude that MI is the most effective imputation method in our MCAR simulation study.
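To make the single-imputation strategies named in this abstract concrete, here is a minimal NumPy sketch. It is not the authors' simulation code. The data-generating model, the function names (`make_mcar`, `complete_case`, `mean_impute`, `locf`, `mean_slope`), and the choice to keep the baseline visit always observed are assumptions made for illustration. MI is left out because it needs an imputation model and a pooling step that the abstract does not specify.

```python
import numpy as np

def make_mcar(y, rate, rng):
    """Copy y and set each post-baseline entry to NaN independently with probability `rate`."""
    y = np.asarray(y, dtype=float).copy()
    mask = rng.random(y.shape) < rate
    mask[:, 0] = False  # assumption: the baseline visit is always observed
    y[mask] = np.nan
    return y

def complete_case(y):
    """Keep only subjects (rows) with no missing visit."""
    return y[~np.isnan(y).any(axis=1)]

def mean_impute(y):
    """Replace each missing value with the observed mean of its visit (column)."""
    col_means = np.nanmean(y, axis=0)
    return np.where(np.isnan(y), col_means, y)

def locf(y):
    """Last observation carried forward along each subject's row."""
    out = y.copy()
    for t in range(1, out.shape[1]):
        missing = np.isnan(out[:, t])
        out[missing, t] = out[missing, t - 1]
    return out

def mean_slope(y, times):
    """Least-squares slope of the visit means regressed on time."""
    return np.polyfit(times, y.mean(axis=0), 1)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    times = np.arange(5)
    n = 500
    # Hypothetical model: random intercept per subject, true slope 0.5.
    intercepts = rng.normal(0.0, 1.0, size=(n, 1))
    y = 1.0 + 0.5 * times + intercepts + rng.normal(0.0, 1.0, size=(n, 5))
    y_miss = make_mcar(y, 0.30, rng)
    for name, fn in [("CC", complete_case), ("mean", mean_impute), ("LOCF", locf)]:
        print(name, round(mean_slope(fn(y_miss), times), 3))
```

Because LOCF copies earlier values forward, it tends to flatten a rising trajectory, which is one intuitive reason it can be biased for a slope even when data are MCAR.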
Missing data occur frequently in longitudinal data analysis, and many methods have been proposed in the literature to handle them. Complete case (CC), mean substitution (MS), last observation carried forward (LOCF), and multiple imputation (MI) are the four methods most frequently used in practice. In real-world data analysis, the missing data can be missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR), depending on the reasons that lead to the missingness. In this paper, simulations under various situations (including missing mechanisms, missing rates, and slope sizes) were conducted to evaluate the performance of the four methods, using bias, RMSE, and 95% coverage probability as evaluation criteria. The results showed that LOCF has the largest bias and the poorest 95% coverage probability in most cases under both the MAR and MCAR mechanisms. Hence, LOCF should not be used in longitudinal data analysis. Under MCAR, CC and MI perform equally well. Under MAR, MI has the smallest bias, the smallest RMSE, and the best 95% coverage probability. Therefore, either CC or MI is appropriate under MCAR, while MI is the more reliable and better grounded statistical method under MAR.
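The three evaluation criteria in this abstract can be computed from simulation replicates as in the sketch below. The function names and the normal-approximation interval (estimate ± 1.96 standard errors) are assumptions; the abstract does not say how the intervals were constructed.

```python
import numpy as np

def bias(estimates, truth):
    """Average estimate across replicates minus the true parameter value."""
    return float(np.mean(estimates) - truth)

def rmse(estimates, truth):
    """Root mean squared error of the estimates around the truth."""
    return float(np.sqrt(np.mean((np.asarray(estimates) - truth) ** 2)))

def coverage(estimates, std_errors, truth, z=1.96):
    """Share of replicates whose interval estimate +/- z * SE contains the truth."""
    estimates = np.asarray(estimates)
    std_errors = np.asarray(std_errors)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    return float(np.mean((lower <= truth) & (truth <= upper)))
```

Each function takes one estimate (and one standard error) per simulated data set, so a method's row in a results table is three calls on the same arrays.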
In longitudinal studies, measurements are taken repeatedly over time on the same experimental unit, so the measurements are correlated. Missing data are very common in such studies, and much research has addressed how to analyze these data appropriately. Generalized Estimating Equations (GEE) is a popular method for the analysis of non-Gaussian longitudinal data. In the presence of missing data, GEE requires the strong assumption of missing completely at random (MCAR). Multiple Imputation Generalized Estimating Equations (MIGEE), Inverse Probability Weighted Generalized Estimating Equations (IPWGEE), and Double Robust Generalized Estimating Equations (DRGEE) have been proposed as ways to ensure valid inference under missing at random (MAR). In this study, the three extensions of GEE are compared under various dropout rates and sample sizes through simulation studies. Under both the MAR and MCAR mechanisms, the simulation results showed better performance of DRGEE compared to IPWGEE and MIGEE. The best-performing method was applied to a real data set.
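For orientation, the standard GEE and its inverse-probability-weighted version are commonly written as follows. The notation is an assumption and is not taken from the abstract: $D_i = \partial \mu_i / \partial \beta^{\top}$, $V_i$ is the working covariance matrix, $R_{ij}$ indicates whether subject $i$ is observed at occasion $j$, and $\pi_{ij}$ is the modeled probability of that event.

```latex
% Standard GEE for the regression parameter beta
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( Y_i - \mu_i(\beta) \bigr) = 0

% IPW-GEE: each observed response is weighted by the inverse of its
% probability of being observed
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} W_i \bigl( Y_i - \mu_i(\beta) \bigr) = 0,
\qquad
W_i = \operatorname{diag}\!\left( \frac{R_{ij}}{\pi_{ij}} \right)_{j=1}^{n_i}
```

The weighting is what relaxes the MCAR requirement to MAR, provided the model for $\pi_{ij}$ is correctly specified.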
This article deals with some new chain imputation methods that use two auxiliary variables under the missing completely at random (MCAR) approach. The proposed generalized classes of chain imputation methods are tested from the viewpoint of optimality in terms of MSE. The proposed imputation methods can be considered an efficient extension of the work of Singh and Horn (Metrika 51:267-276, 2000), Singh and Deo (Stat Pap 44:555-579, 2003), Singh (Stat A J Theor Appl Stat 43(5):499-511, 2009), Kadilar and Cingi (Commun Stat Theory Methods 37:2226-2236, 2008), and Diana and Perri (Commun Stat Theory Methods 39:3245-3251, 2010). The performance of the proposed chain imputation methods is investigated relative to the conventional chain-type imputation methods. Theoretical results are derived and a comparative study is conducted; the results are encouraging and show an improvement over the work discussed.