Abstract: Demographic estimation becomes a problem of small area estimation when detailed disaggregation leads to small cell counts. The usual difficulties of small area estimation are compounded when the available data sources contain measurement errors. We present a Bayesian approach to the problem of small area estimation with imperfect data sources. The overall model contains separate submodels for underlying demographic processes and for measurement processes. All unknown quantities in the model, including coverage ratios and demographic rates, are estimated jointly via Markov chain Monte Carlo methods. The approach is illustrated using the example of provincial fertility rates in Cambodia.
Funding: This research was supported in part by the Natural Resources Conservation Service of the U.S. Department of Agriculture.
Abstract: In this article, a new unit level model based on a pairwise penalised regression approach is proposed for problems in small area estimation (SAE). Instead of assuming common regression coefficients for all small domains, as in the traditional model, the new estimator is based on a subgroup regression model which allows different regression coefficients in different groups. The alternating direction method of multipliers (ADMM) algorithm is used to find subgroups with different regression coefficients. We also consider pairwise spatial weights for spatial areal data. In the simulation study, we compare the performance of the new estimator with that of the traditional small area estimator. We also apply the new estimator to urban area estimation using data from the National Resources Inventory survey in Iowa.
Abstract: Simulation (stochastic) methods are based on obtaining random samples θ^5 from the desired distribution p(θ) and estimating the expectation of any function h(θ). Simulation methods can be used for high-dimensional distributions, and there are general algorithms which work for a wide variety of models. Markov chain Monte Carlo (MCMC) methods have been important in making Bayesian inference practical for generic hierarchical models in small area estimation. Small area estimation is a method for producing reliable estimates for small areas. Model-based Bayesian small area estimation methods are becoming popular for their ability to combine information from several sources and to take account of spatial prediction for spatial data. In this study, a detailed simulation algorithm is given, and the performance of a non-trivial extension of a hierarchical Bayesian model for binary data under spatial misalignment is assessed. Both areal level and unit level latent processes were considered in modeling. The process models generated from the predictors were used to construct the basis so as to alleviate the problem of collinearity between the true predictor variables and the spatial random process. The performance of the proposed model was assessed using MCMC simulation studies. The performance was evaluated with respect to root mean square error (RMSE), mean absolute error (MAE) and coverage probability of the corresponding 95% CI of the estimate. The estimates from the proposed model perform better than the direct estimate.
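The simulation idea described in the preceding abstract (draw samples from p(θ), then average h(θ) over them) can be illustrated with a minimal sketch. It is not the algorithm from the paper: the random-walk Metropolis sampler, the standard normal target, the step size, and the function names `metropolis` and `mc_expectation` are all assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_p, theta0, n_samples, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_p returns the log of the (possibly unnormalised) target density.
    """
    samples = []
    theta = theta0
    log_p_theta = log_p(theta)
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        log_p_prop = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_prop - log_p_theta:
            theta, log_p_theta = proposal, log_p_prop
        samples.append(theta)
    return samples


def mc_expectation(h, samples, burn_in=0):
    """Estimate E[h(theta)] by averaging h over the retained samples."""
    kept = samples[burn_in:]
    return sum(h(t) for t in kept) / len(kept)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical target: standard normal, log density up to a constant.
    draws = metropolis(lambda t: -0.5 * t * t, 0.0, 50_000, 2.4, rng)
    # For a standard normal, E[theta^2] equals 1.
    print(mc_expectation(lambda t: t * t, draws, burn_in=1_000))
```

The same two-step pattern carries over to hierarchical models, where θ is a vector and the sampler updates its components in turn.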
Abstract: The generalized linear mixed model (GLMM) has been widely used in small area estimation for health indicators. Bayesian estimation is usually used to construct statistical intervals; however, its computational intensity is a big challenge for large complex surveys. Frequentist approaches, such as bootstrapping and Monte Carlo (MC) simulation, are also applied but have not been evaluated in terms of the interval magnitude, width, and the computational time consumed. The 2013 Florida Behavioral Risk Factor Surveillance System data was used as a case study. County-level estimated prevalence of three health-related outcomes was obtained through a GLMM, and their 95% confidence intervals (CIs) were generated from bootstrapping and MC simulation. The intervals were compared to 95% credible intervals from a hierarchical Bayesian model. The results showed that the 95% CIs for county-level estimates of each outcome obtained by MC simulation were similar to the 95% credible intervals generated by Bayesian estimation and were the most computationally efficient. MC simulation could be a viable option for constructing statistical intervals for small area estimation in public health practice.
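To make the contrast between bootstrapping and MC simulation in the preceding abstract concrete, here is a deliberately simplified sketch for a single area's prevalence. It does not fit a GLMM and is not the procedure used in the study: the binary data, the delta-method standard error on the logit scale, and the function names are assumptions for illustration only.

```python
import math
import random


def percentile_interval(draws, level=0.95):
    """Return the central `level` percentile interval of a list of draws."""
    s = sorted(draws)
    alpha = (1.0 - level) / 2.0
    lower = s[math.floor(alpha * (len(s) - 1))]
    upper = s[math.ceil((1.0 - alpha) * (len(s) - 1))]
    return lower, upper


def bootstrap_interval(y, n_boot, rng):
    """Percentile bootstrap: resample the observations with replacement."""
    n = len(y)
    draws = [sum(rng.choices(y, k=n)) / n for _ in range(n_boot)]
    return percentile_interval(draws)


def mc_simulation_interval(y, n_sim, rng):
    """MC simulation: draw from a normal approximation on the logit scale.

    Assumes 0 < mean(y) < 1 so that the logit and its SE are defined.
    """
    n = len(y)
    p_hat = sum(y) / n
    logit = math.log(p_hat / (1.0 - p_hat))
    # Delta-method standard error of logit(p_hat).
    se = math.sqrt(1.0 / (n * p_hat * (1.0 - p_hat)))
    draws = [1.0 / (1.0 + math.exp(-rng.gauss(logit, se))) for _ in range(n_sim)]
    return percentile_interval(draws)


if __name__ == "__main__":
    rng = random.Random(1)
    y = [1] * 30 + [0] * 70  # hypothetical area: 30 of 100 respondents positive
    print("bootstrap:", bootstrap_interval(y, 2000, rng))
    print("MC simulation:", mc_simulation_interval(y, 2000, rng))
```

The bootstrap repeats the estimation step on each resample, whereas the MC simulation only draws from an estimated sampling distribution, which is one reason the latter tends to be cheaper when the estimator is a fitted model.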
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11301514); partially supported by the National Natural Science Foundation of China (Grant Nos. 11271355 and 70625004) and the National Bureau of Statistics of China (Grant No. 2012LZ012).
Abstract: The linear mixed-effects model (LMM) is a very useful tool for analyzing clustered data. In practice, however, the exact values of the variables are often difficult to observe. In this paper, we consider the LMM with measurement errors in the covariates. The empirical BLUP estimator of the linear combination of the fixed and random effects and its approximate conditional MSE are derived. An application to small area estimation is provided. A simulation study shows good performance of the proposed estimators.
Abstract: Three keynote lectures are presented at the conference of Small Area Estimation and Other Topics of Current Interest in Surveys, Official Statistics and General Statistics (SAE 2018), an international conference held between June 16 and 18 at East China Normal University, Shanghai, China. The speakers of these lectures are the world-famous statistics professors James O. Berger, J. N. K. Rao and Malay Ghosh. The lectures mainly review previous studies and present pioneering results covering Bayesian statistics, small area estimation, shrinkage priors, etc.
Abstract: An interview with Professor Danny Pfeffermann is conducted during the conference of Small Area Estimation 2018 (SAE 2018), an international conference held between June 16 and 18 at East China Normal University, Shanghai, China. SAE 2018 is also a celebration of Professor Pfeffermann’s 75th birthday. Our interview consists of eight questions, which focus on Professor Pfeffermann’s personal education background, research motivations, contributions to the development of statistics, opinions on big data and data science, and his future plans. Professor Pfeffermann used interesting examples to express his opinions on the future development of statistics.