Funding: Supported by the National Natural Science Foundation of China (10571093).
Abstract: In stratified survey sampling, complete auxiliary information is sometimes available. A fundamental question is how to use this information effectively at the estimation stage. In this paper, we extend the model-calibration method to obtain estimators of the finite population mean by using complete auxiliary information from stratified sampling survey data. We show that the resulting estimators use auxiliary information effectively at the estimation stage and possess a number of attractive features, such as being asymptotically design-unbiased irrespective of the working model and approximately model-unbiased under the model. When a linear working model is used, the resulting estimators reduce to the usual calibration estimator (or GREG).
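As a rough illustration of the linear working-model case only (not the paper's model-calibration estimator), the sketch below computes a separate regression estimator of the population mean under stratified simple random sampling. This is a standard textbook form that GREG is commonly described as reducing to when each stratum has its own intercept and slope. The function names, the dictionary layout, and the toy numbers are assumptions made for this example.

```python
def regression_mean(y, x, x_pop_mean):
    """Regression estimate of a stratum mean from a simple random sample.

    y, x: sampled survey and auxiliary values (same length).
    x_pop_mean: known stratum mean of the auxiliary variable.
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Fall back to the plain sample mean if x does not vary in the sample.
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar + slope * (x_pop_mean - x_bar)


def stratified_regression_mean(strata):
    """Combine per-stratum regression estimates using weights N_h / N."""
    total_size = sum(s["N"] for s in strata)
    return sum(
        s["N"] / total_size * regression_mean(s["y"], s["x"], s["x_pop_mean"])
        for s in strata
    )


# Hypothetical toy data: y is exactly linear in x within each stratum.
toy_strata = [
    {"N": 10, "y": [3.0, 5.0, 7.0], "x": [1.0, 2.0, 3.0], "x_pop_mean": 4.0},
    {"N": 30, "y": [0.0, 2.0], "x": [0.0, 2.0], "x_pop_mean": 3.0},
]
estimate = stratified_regression_mean(toy_strata)
```

Because the toy data are exactly linear within each stratum, the estimate recovers the weighted mean implied by the known auxiliary means; with real data the adjustment only corrects for the part of the sampling error that the auxiliary variable explains.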
Funding: Supported in part by the National Funds of Natural Sciences (7860013).
Abstract: In July 1987, the Sampling Survey of Children's Situation was conducted in 9 provinces/autonomous regions of China. A stratified two-stage cluster sampling plan was designed for the survey. The paper presents the methods of stratification, of selecting n=2 PSUs (cities/counties) with unequal probabilities without replacement in each stratum, and of selecting residents/village committees in each sampled city/county. All formulae for estimating population characteristics (especially population totals and ratios of two totals) and for estimating the variances of those estimators are given. Finally, we give a preliminary analysis of the precision of the survey based on the results of data processing.
Funding: Supported by the National Natural Science Foundation of China.
Abstract: This paper proposes a new method for increasing precision in survey sampling, namely a method combining sampling with prediction. Two cases are considered: auxiliary information is available, and it is not. A numerical example is given.
Funding: The Public Science and Technology Research Funds Projects of Ocean under contract No. 201305030; the Specialized Research Fund for the Doctoral Program of Higher Education under contract No. 20120132130001.
Abstract: Fishery-independent surveys are often used to collect high-quality biological and ecological data to support fisheries management. Careful optimization of fishery-independent survey design is necessary to improve the precision of survey estimates with cost-effective sampling effort. We developed a simulation approach to evaluate and optimize the stratification scheme for a fishery-independent survey with multiple goals, including estimation of abundance indices of individual species and of species diversity indices. We compared the performance of sampling designs with different stratification schemes for different goals over different months. For most indices, the stratification schemes gave more precise survey estimates than a simple random sampling design. The stratification scheme with five strata performed the best. This study showed that the loss of precision due to reduced sampling effort could be compensated for by improved stratification schemes, which would reduce the cost of survey trawling and its negative impacts on species with low abundance. This study also suggests that the optimal survey design differs with the survey objectives. A post-survey analysis can improve the stratification scheme of fishery-independent survey designs.
Abstract: Data from the 2013 Canadian Tobacco, Alcohol and Drugs Survey and two other surveys are used to determine the effects of cannabis use on self-reported physical and mental health. Daily or almost daily marijuana use is shown to be detrimental to both measures of health for some age groups but not all. The age-group-specific effects depend on gender: males and females respond differently to cannabis use. The health costs of regularly using cannabis are significant, but they are much smaller than those associated with tobacco use. These costs are attributed both to the presence of delta-9-tetrahydrocannabinol and to the fact that smoking cannabis is itself a health hazard because of the toxic properties of the smoke ingested. Cannabis use is costlier to regular smokers, and first use before the age of 15 or 20, as well as being a former user, leads to permanently reduced physical and mental capacities. These results strongly suggest that the legalization of marijuana be accompanied by educational programs, counseling services, and a delivery system that minimizes juvenile and young adult usage.
Abstract: In this paper, auxiliary information is used to construct an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted in which local polynomial regression is used to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is compared with that of several design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. In general, the nonparametric regression estimators yield good confidence intervals, and the proposed estimator leads to relatively smaller values of RE than the other estimators.
Abstract: In the field work of population-based research, 3 groups of eyes were graded by 2 observers using LOCS Ⅱ. The reproducibility of LOCS Ⅱ was evaluated by the agreement rates (85%-100%) and kappa values (0.661-1) obtained in our study. The satisfactory results show that LOCS Ⅱ is not only easy to learn and to apply consistently by different observers but also has good reproducibility in field work. A longitudinal cataract study is planned.
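The two reproducibility measures reported above, percent agreement and the kappa value, can be sketched for a pair of observers as follows. This is a minimal illustration of Cohen's kappa, assuming that is the statistic meant by the abstract; the function names and the grades are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(grades_a, grades_b):
    """Share of eyes given the same grade by both observers."""
    return sum(a == b for a, b in zip(grades_a, grades_b)) / len(grades_a)


def cohens_kappa(grades_a, grades_b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(grades_a)
    observed = percent_agreement(grades_a, grades_b)
    counts_a, counts_b = Counter(grades_a), Counter(grades_b)
    # Chance agreement: product of the two observers' marginal proportions.
    expected = sum(
        counts_a[g] * counts_b[g] for g in counts_a.keys() | counts_b.keys()
    ) / n ** 2
    if expected == 1.0:
        return 1.0  # both observers used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for four eyes from two observers.
observer_1 = [0, 0, 1, 1]
observer_2 = [0, 0, 1, 0]
agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why the two measures are usually reported together.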
Abstract: AIM: To analyze the incidence of hepatocellular carcinoma (HCC) in a population that underwent health checkups and had high serum miR-106b levels. METHODS: A total of 335 subjects who underwent checkups in the Digestive and Liver Disease Department of our hospital were randomly selected. RT-PCR was used to detect the level of miR-106b in serum samples. Laboratory and imaging examinations were carried out to confirm the HCC diagnosis in patients who had a > 2-fold change in miR-106b levels. Ultrasound-guided biopsy was also used for HCC diagnosis when necessary. On this basis, the clinical data of these subjects, including history of hepatitis virus infection, obesity, long-term history of alcohol use and stage of HCC, were collected. The impact of these factors on the serum level of miR-106b was then analyzed. Furthermore, a receiver operating characteristic (ROC) curve was drawn to evaluate the diagnostic efficacy of miR-106b for HCC. RESULTS: A total of 35 subjects had abnormal serum miR-106b levels, of whom 20 were diagnosed with HCC. A t-test revealed that the differences in serum miR-106b level by sex, age, history of hepatitis virus infection, obesity and long-term history of alcohol use were not statistically significant. However, serum miR-106b levels in patients with advanced HCC (stage. /.) were higher than in patients with early HCC (stage./.), and the difference was statistically significant (P = 0.000). Moreover, the ROC curve revealed that the area under the curve for miR-106b was 0.885, which shows that serum miR-106b level has a certain clinical value for HCC diagnosis. CONCLUSION: The random sampling survey shows that serum miR-106b level is a valuable diagnostic marker for HCC. However, the diagnostic threshold value needs further research.
Abstract: The goals of any major business transformation programme in an official statistical agency often include improving data collection efficiency, data processing methodologies and data quality. However, achieving such improvements may have transitional statistical impacts that could be misinterpreted as real-world changes if they are not measured and handled appropriately. This paper describes development work that explored the design and analysis of a time-series experiment to measure the statistical impacts that sometimes occur during survey redesigns. The Labour Force Survey (LFS) of the Australian Bureau of Statistics (ABS) was used as a case study. In the present study: (1) a large-scale field experiment was designed and conducted that allowed the outgoing and the incoming surveys to run in parallel for some periods to measure the impacts of any changes to the survey process; and (2) the precision of the impact measurement was continuously improved while the new survey design was being implemented. The state space modelling (SSM) technique was adopted as the main approach, as it provides an efficient impact measurement. This approach enabled the sampling error structure to be incorporated in the time-series intervention analysis. The approach could also be extended to take advantage of other related data sources (e.g., the data obtained from the parallel data collection process) to improve the efficiency and accuracy of the impact measurement. Although the LFS was used as a case study, the models and methods developed in this study could be extended to other surveys.
Funding: Key Program of National Natural Science Foundation of China, No. 41930757; GDAS' Project of Science and Technology Development, No. 2020GDASYL-20200104005.
Abstract: Driven by urbanization and industrialization, arable land in hilly and mountainous regions of China is gradually becoming marginalized, with arable land abandonment rapidly expanding from poor-quality sloping arable land to high-quality terraces. The abandonment of large-scale terraces will lead to a series of socio-economic and ecological effects. A national sample survey was used to investigate the extent and spatial distribution of terrace abandonment in China, and a total of 560 valid village questionnaires from 329 counties were collected in the mountainous areas of China. The main findings are as follows: (1) Terrace abandonment was widespread throughout the country, with 54% of the surveyed villages exhibiting terrace abandonment and the area of abandoned terraces accounting for 9.79% of the total. (2) The degree of terrace abandonment is high in the south and low in the north. The region with the most serious abandonment was the hilly and mountainous areas in the south, especially the middle and lower Yangtze River region. (3) The main driving factors of terrace abandonment were rural labor migration, agricultural mechanization level, irrigation conditions, and transportation conditions for cultivation. Targeted measures should be taken based on the specific conditions of each area to alleviate terrace abandonment. Measures such as improving terrace mechanization are universally applicable. Specifically, low-quality terraces can be withdrawn in an orderly manner, while for high-quality terraces multiple measures are needed to consolidate agricultural production, such as adjusting the planting structure, strengthening agricultural infrastructure construction, and encouraging the transfer of land-use rights as well as large-scale operation.
Abstract: Objective: To compare the adverse maternal and neonatal outcomes of multiple pregnancy and singleton pregnancy across multiple medical centers in Beijing. Methods: Data concerning maternal and neonatal adverse outcomes in multiple and singleton pregnancies were collected from 15 hospitals in Beijing by a systematic cluster sampling survey conducted from 20 June to 30 November 2013. The SPSS software (version 20.0) was used for data analysis. The χ² test was used for statistical analyses. Results: The rate of caesarean deliveries was much higher in women with multiple pregnancies (85.8%) than in women with singleton pregnancies (42.6%, χ² = 190.8, P < 0.001). The incidences of anemia (χ² = 40.023, P < 0.001), preterm labor (χ² = 1021.172, P < 0.001), gestational diabetes mellitus (χ² = 9.311, P < 0.01), hypertensive disorders (χ² = 122.708, P < 0.001) and post-partum hemorrhage (χ² = 48.550, P < 0.001) were significantly increased with multiple pregnancy. In addition, multiple pregnancy was associated with significantly higher rates of small-for-gestational-age infants (χ² = 92.602, P < 0.001), low birth weight (χ² = 1141.713, P < 0.001), and neonatal intensive care unit (NICU) admission (χ² = 340.129, P < 0.001). Conclusions: Multiple pregnancy is a significant risk factor for adverse maternal and neonatal outcomes in Beijing. Improving obstetric care for multiple pregnancy, particularly in reducing preterm labor, is required to reduce the risk to mothers and infants.
Funding: Supported by the National Natural Science Foundation of China.
Abstract: This paper considers the admissibility of estimators for a finite population when the parameter space is restricted. We obtain all admissible linear estimators of an arbitrary linear function of the characteristic values of a finite population, within the class of linear estimators, under the criterion of expected mean squared error.
Funding: China Postdoctoral Science Foundation (Grant Nos. 2021M691443, 2021TQ0141); SUSTC Presidential Postdoctoral Fellowship. Huiming Zhang was supported in part by the University of Macao under UM Macao Talent Programme (UMMTP-2020-01).
Abstract: Few studies focus on the application of functional data to design-based survey sampling. In this paper, a scalar-on-function regression model-assisted method is proposed to estimate finite population means with auxiliary functional data information. The functional principal component method is used to estimate the functional linear regression model. Our proposed functional linear regression model-assisted (FLR-assisted) estimator is asymptotically design-unbiased and consistent under mild conditions. Simulation experiments and real data analysis show that the FLR-assisted estimators are more efficient than the Horvitz-Thompson estimators under different sampling designs.
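For reference, the Horvitz-Thompson estimator used as the baseline above weights each sampled value by the inverse of its inclusion probability. The sketch below shows only that baseline, not the FLR-assisted estimator; the function name and the toy numbers are assumptions made for this example.

```python
def horvitz_thompson_mean(y_sample, inclusion_probs, population_size):
    """Horvitz-Thompson estimate of a finite population mean.

    y_sample: observed values for the sampled units.
    inclusion_probs: first-order inclusion probability of each sampled unit.
    population_size: number of units N in the finite population.
    """
    if len(y_sample) != len(inclusion_probs):
        raise ValueError("one inclusion probability is needed per sampled unit")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Estimated population total: each unit stands in for 1/pi units.
    estimated_total = sum(y / p for y, p in zip(y_sample, inclusion_probs))
    return estimated_total / population_size


# Hypothetical example: simple random sample of 2 units from N = 4,
# so every unit has inclusion probability 2/4 = 0.5.
ht_mean = horvitz_thompson_mean([3.0, 5.0], [0.5, 0.5], 4)
```

Under equal inclusion probabilities the estimate coincides with the sample mean; model-assisted estimators such as the one in the abstract add a regression adjustment on top of this design-weighted form.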