Abstract: Variable selection is one of the most fundamental problems in regression analysis. By sampling from the posterior distribution over candidate models, Bayesian variable selection via MCMC (Markov chain Monte Carlo) effectively overcomes the computational burden of all-subset variable selection approaches. However, the convergence of the MCMC is often hard to determine, and it is often unclear whether the samples obtained are unbiased. This complication has limited the application of Bayesian variable selection in practice. Based on the idea of CFTP (coupling from the past), perfect sampling schemes have been developed to obtain independent samples from the posterior distribution for a variety of problems. Here the authors propose an efficient and effective perfect sampling algorithm for Bayesian variable selection in linear regression models, which draws independent and identically distributed samples from the posterior distribution over the model space and can efficiently handle thousands of variables. The effectiveness of the authors' algorithm is illustrated by three simulation studies with up to thousands of variables. The method is further illustrated in a single nucleotide polymorphism (SNP) association study among rheumatoid arthritis (RA) patients.