The aim of this study is to investigate the impact of landslide and non-landslide sampling strategies on the performance of landslide susceptibility assessment (LSA). The study area is the Feiyun catchment in Wenzhou City, Southeast China. Two types of landslide samples, combined with seven non-landslide sampling strategies, resulted in a total of 14 scenarios. The corresponding landslide susceptibility map (LSM) for each scenario was generated using the random forest model. The receiver operating characteristic (ROC) curve and statistical indicators were calculated and used to assess the impact of the dataset sampling strategy. The results showed that higher accuracies were achieved when using the landslide core as positive samples, combined with non-landslide sampling from the very low zone or buffer zone. The results reveal the influence of landslide and non-landslide sampling strategies on the accuracy of LSA, which provides a reference for subsequent researchers aiming to obtain a more reasonable LSM.
A composite random variable is a product (or sum of products) of statistically distributed quantities. Such a variable can represent the solution to a multi-factor quantitative problem submitted to a large, diverse, independent, anonymous group of non-expert respondents (the “crowd”). The objective of this research is to examine the statistical distribution of solutions from a large crowd to a quantitative problem involving image analysis and object counting. Theoretical analysis by the author, covering a range of conditions and types of factor variables, predicts that composite random variables are distributed log-normally to an excellent approximation. If the factors in a problem are themselves distributed log-normally, then their product is rigorously log-normal. A crowdsourcing experiment devised by the author and implemented with the assistance of a BBC (British Broadcasting Corporation) television show yielded a sample of approximately 2000 responses consistent with a log-normal distribution. The sample mean was within ~12% of the true count. However, a Monte Carlo simulation (MCS) of the experiment, employing either normal or log-normal random variables as factors to model the processes by which a crowd of 1 million might arrive at their estimates, resulted in a visually perfect log-normal distribution with a mean response within ~5% of the true count. The results of this research suggest that a well-modeled MCS, by simulating a sample of responses from a large, rational, and incentivized crowd, can provide a more accurate solution to a quantitative problem than might be attainable by direct sampling of a smaller crowd or an uninformed crowd, irrespective of size, that guesses randomly.
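The log-normal claim for products of log-normal factors can be checked with a small simulation. The following is a minimal sketch, not the author's MCS: the function names, the three (mu, sigma) factor parameters, and the crowd size are all hypothetical values chosen for illustration. It draws each factor from a log-normal distribution, multiplies them, and inspects the logarithm of the product, which should be normal with mean equal to the sum of the mu values and variance equal to the sum of the squared sigma values.

```python
import math
import random
import statistics

def simulate_products(n_respondents, factor_params, seed=0):
    """Return one composite estimate (product of factors) per respondent.

    factor_params is a list of hypothetical (mu, sigma) pairs; each factor
    is drawn from a log-normal distribution with those log-scale parameters.
    """
    rng = random.Random(seed)
    products = []
    for _ in range(n_respondents):
        value = 1.0
        for mu, sigma in factor_params:
            value *= rng.lognormvariate(mu, sigma)
        products.append(value)
    return products

def skewness(xs):
    """Population skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

# Hypothetical three-factor problem answered by 20,000 simulated respondents.
params = [(1.0, 0.3), (2.0, 0.2), (0.5, 0.4)]
products = simulate_products(20_000, params)
logs = [math.log(p) for p in products]

expected_mean = sum(mu for mu, _ in params)
expected_sd = math.sqrt(sum(sigma ** 2 for _, sigma in params))
print("mean of log-products:", statistics.fmean(logs), "expected:", expected_mean)
print("sd of log-products:  ", statistics.pstdev(logs), "expected:", expected_sd)
print("skewness of log-products (near 0 for a normal):", skewness(logs))
```

A near-zero skewness of the log-products, together with a clearly positive skewness of the raw products, is what a log-normal distribution predicts.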
On the basis of the principles of simple random sampling, the statistical model of rate of disfigurement (RD) is put forward and described in detail. According to the definition of simple random sampling for the attribute data in GIS, the mean and variance of the RD are deduced as the characteristic value of the statistical model in order to explain the feasibility of the accuracy measurement of the attribute data in GIS by using the RD. Moreover, on the basis of the mean and variance of the RD, the quality assessment method for attribute data of vector maps during the data collecting is discussed. The RD spread graph is also drawn to see whether the quality of the attribute data is under control. The RD model can synthetically judge the quality of attribute data, which is different from other measurement coefficients that only discuss accuracy of classification.
Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples in this study using a multistage cluster random sampling method. The basic information questionnaire and Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. The results of the major TCM constitution types showed that BC (43.2%) accounted for the largest proportion and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age > 70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
The aim of this paper is to compare sample quality across two probability samples and one that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters in order to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that a combination of random route sampling and quota sampling (with substitution) can achieve the same degree of representativeness as household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distribution of the probability sampling, where the deviations exceed 6%. A different picture emerges when it comes to comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).
In general, the accuracy of the mean estimator can be improved by stratified random sampling. In this paper, we propose an idea different from empirical methods: under some conditions, the accuracy can be improved further through the bootstrap resampling method. The determination of sample size by the bootstrap method is also discussed, and a simulation is carried out to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
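The combination of stratification and bootstrapping can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure proposed in the paper: it computes the usual stratum-weighted mean and estimates its standard error by resampling with replacement inside each stratum. The function names, the stratum labels, and the data are hypothetical.

```python
import random
import statistics

def stratified_mean(samples, weights):
    """Weighted mean of per-stratum sample means.

    samples maps a stratum label to its sampled values; weights maps the
    same labels to population shares that are assumed to sum to 1.
    """
    return sum(weights[h] * statistics.fmean(samples[h]) for h in samples)

def bootstrap_se(samples, weights, n_boot=1000, seed=0):
    """Bootstrap standard error of the stratified mean.

    Each replicate resamples with replacement within every stratum, so the
    stratum sample sizes stay fixed.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled = {h: rng.choices(vals, k=len(vals)) for h, vals in samples.items()}
        estimates.append(stratified_mean(resampled, weights))
    return statistics.pstdev(estimates)

# Hypothetical data: two strata with equal population shares.
samples = {"a": [1, 2, 3], "b": [10, 20]}
weights = {"a": 0.5, "b": 0.5}
print(stratified_mean(samples, weights))   # 0.5 * 2 + 0.5 * 15 = 8.5
print(bootstrap_se(samples, weights))
```

Repeating the bootstrap for increasing stratum sample sizes until the standard error falls below a target is one simple way to turn this into a sample-size rule; whether that matches the paper's rule is not stated in the abstract.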
One of the key assumptions in respondent-driven sampling (RDS) analysis, called “random selection assumption,” is that respondents randomly recruit their peers from their personal networks. The objective of this study was to verify this assumption in the empirical data of egocentric networks. Methods: We conducted an egocentric network study among young drug users in China, in which RDS was used to recruit this hard-to-reach population. If the random recruitment assumption holds, the RDS-estimated population proportions should be similar to the actual population proportions. Following this logic, we first calculated the population proportions of five visible variables (gender, age, education, marital status, and drug use mode) among the total drug-use alters from which the RDS sample was drawn, and then estimated the RDS-adjusted population proportions and their 95% confidence intervals in the RDS sample. Theoretically, if the random recruitment assumption holds, the 95% confidence intervals estimated in the RDS sample should include the population proportions calculated in the total drug-use alters. Results: The evaluation of the RDS sample indicated its success in reaching the convergence of RDS compositions and including a broad cross-section of the hidden population. Findings demonstrate that the random selection assumption holds for three group traits, but not for two others. Specifically, egos randomly recruited subjects in different age groups, marital status, or drug use modes from their network alters, but not in gender and education levels. Conclusions: This study demonstrates the occurrence of non-random recruitment, indicating that the recruitment of subjects in this RDS study was not completely at random. Future studies are needed to assess the extent to which the population proportion estimates can be biased when the violation of the assumption occurs in some group traits in RDS samples.
The objectives of this paper are to demonstrate the algorithms employed by three statistical software programs (R, Real Statistics using Excel, and SPSS) for calculating the exact two-tailed probability of the Wald-Wolfowitz one-sample runs test for randomness, to present a novel approach for computing this probability, and to compare the four procedures by generating samples of 10 and 11 data points, varying the parameters n0 (number of zeros) and n1 (number of ones), as well as the number of runs. Fifty-nine samples are created to replicate the behavior of the distribution of the number of runs with 10 and 11 data points. The exact two-tailed probabilities for the four procedures were compared using Friedman’s test. Given the significant difference in central tendency, post-hoc comparisons were conducted using Conover’s test with Benjamini-Yekutieli correction. It is concluded that the procedures of Real Statistics using Excel and R exhibit some inadequacies in the calculation of the exact two-tailed probability, whereas the new proposal and the SPSS procedure are deemed more suitable. The proposed robust algorithm has a more transparent rationale than the SPSS one, albeit being somewhat more conservative. We recommend its implementation for this test and its application to others, such as the binomial and sign test.
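For orientation, the exact null distribution of the number of runs can be computed directly from the standard combinatorial formulas. The sketch below is not any of the four procedures compared in the paper: the two-tailed rule used here (sum the probabilities of all outcomes that are no more likely than the observed one) is only one common convention and is an assumption made for illustration, as are the function names.

```python
from fractions import Fraction
from math import comb

def runs_pmf(n0, n1):
    """Exact distribution of the number of runs R for n0 zeros and n1 ones.

    Uses the classical counts: an even R = 2k arises in
    2*C(n0-1, k-1)*C(n1-1, k-1) arrangements, and an odd R = 2k+1 in
    C(n0-1, k)*C(n1-1, k-1) + C(n0-1, k-1)*C(n1-1, k) arrangements,
    out of C(n0+n1, n0) equally likely arrangements. Requires n0, n1 >= 1.
    """
    total = comb(n0 + n1, n0)
    pmf = {}
    for r in range(2, n0 + n1 + 1):
        k, odd = divmod(r, 2)
        if odd:
            ways = (comb(n0 - 1, k) * comb(n1 - 1, k - 1)
                    + comb(n0 - 1, k - 1) * comb(n1 - 1, k))
        else:
            ways = 2 * comb(n0 - 1, k - 1) * comb(n1 - 1, k - 1)
        if ways:
            pmf[r] = Fraction(ways, total)
    return pmf

def two_tailed_p(n0, n1, observed_runs):
    """Two-tailed probability under an assumed convention: the total
    probability of all run counts that are no more likely than the
    observed one."""
    pmf = runs_pmf(n0, n1)
    p_obs = pmf.get(observed_runs, Fraction(0))
    return sum((p for p in pmf.values() if p <= p_obs), Fraction(0))

# Ten data points with five zeros and five ones, two runs observed.
print(two_tailed_p(5, 5, 2))
```

Using `Fraction` keeps the probabilities exact, which makes it easier to see where different software conventions diverge for small samples.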
In this paper, we propose a software component under Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling) as required by the simulation. RDS is regarded as the best sampling method, as shown in the literature. To validate the proposed component, it is applied to the approximation of integrals. The simulation results from RDS using the "RDSRnd" generator were compared to those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The best results are given by the proposed software component.
In this study, we propose a modified ratio-type estimator for the population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and the median of an auxiliary variable x. The estimator’s properties have been derived up to the first order of Taylor’s series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies have been carried out using real populations to demonstrate the performance of the developed estimator in comparison with the existing estimators. As illustrated by the empirical studies, the proposed estimator performs better than the existing estimators under some specified conditions, i.e., it has the smallest Mean Squared Error and the highest Percentage Relative Efficiency. The developed estimator is therefore suitable for situations in which the variable of interest has a positive correlation with the auxiliary variable.
In this paper, auxiliary information is used to determine an estimator of finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted by making use of the local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared to other estimators.
In this paper, the methodology for applying stratified random sampling with optimum allocation is analyzed for a research subject that concerns the rural population and shows large differences among the three strata into which this population can be classified. The rural population of Evros Prefecture (Greece) was classified, using the mean altitude of settlements as the criterion, into three strata (mountainous, semi-mountainous, and flat) in order to estimate the mean consumption of forest fuelwood for heating and cooking needs in the households of these three strata. The analysis of this methodology includes: (1) the determination of the total sample size for the entire rural population and its allocation to the various strata; (2) the investigation of the effectiveness of stratification using analysis of variance (One-Way ANOVA); (3) the conduct of the sampling research through face-to-face interviews in selected households; and (4) the checking of the questionnaire forms and the analysis of the data using the Statistical Package for the Social Sciences, SPSS for Windows. All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
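Step (1) above, allocating a total sample size to strata, can be sketched as follows. The abstract does not give the allocation formula used, so this example assumes the standard equal-cost optimum (Neyman) allocation, in which stratum h receives a share proportional to N_h * S_h (stratum size times stratum standard deviation, the latter typically estimated from a pilot sample). The function name and all numbers are hypothetical and are not the Evros Prefecture figures.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Allocate total_n sample units to strata in proportion to N_h * S_h.

    stratum_sizes are the population counts N_h and stratum_sds the
    standard deviations S_h. Results are rounded to whole units, so they
    may differ from total_n by a unit or two in general.
    """
    products = [n_h * s_h for n_h, s_h in zip(stratum_sizes, stratum_sds)]
    denominator = sum(products)
    return [round(total_n * p / denominator) for p in products]

# Hypothetical three strata (mountainous, semi-mountainous, flat).
sizes = [100, 200, 700]   # households per stratum (made up)
sds = [10, 5, 1]          # pilot standard deviations of consumption (made up)
print(neyman_allocation(270, sizes, sds))   # [100, 100, 70]
```

The small, highly variable stratum receives as many sample units as a stratum twice its size, which is the point of optimum allocation when strata differ strongly in variability.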
In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of non-parametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Srivastava and Jhajj [16] proposed a class of estimators for estimating population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for estimating the population variance in simple random sampling. The expressions for the mean square error of the proposed class have been derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
The random sampling algorithm was first proposed by Schnorr in 2003 to find short lattice vectors, as an alternative to enumeration. The follow-up developments in random sampling were mainly proposed by Fukase and Kashiwabara in 2015 and Aono and Nguyen in 2017. Although they extended the sampling space compared to Schnorr's work through the natural number representation, they did not show how to sample specifically in practice and what vectors should be sampled in order to find short enough lattice vectors. In this paper, the authors first introduce a practical random sampling algorithm which, under some reasonable assumptions, can find short enough lattice vectors efficiently. Then, as an application of this new random sampling algorithm, the authors show that it can improve the performance of the progressive BKZ algorithm in practice. Finally, the authors solve the Darmstadt Lattice Challenge and obtain a series of new records in dimensions from 500 to 825, using the improved progressive BKZ algorithm.
The quality of debris flow susceptibility mapping varies with sampling strategies. This paper aims at comparing three sampling strategies and determining the optimal one to sample the debris flow watersheds. The three sampling strategies studied were the centroid of the scarp area (COSA), the centroid of the flowing area (COFA), and the centroid of the accumulation area (COAA) of debris flow watersheds. An inventory consisting of 150 debris flow watersheds and 12 conditioning factors was prepared for research. Firstly, the information gain ratio (IGR) method was used to analyze the predictive ability of the conditioning factors. Subsequently, the 12 conditioning factors were used in the modeling of artificial neural network (ANN), random forest (RF), and support vector machine (SVM). Then, the receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate the model performance. Finally, a scoring system was used to score the quality of the debris flow susceptibility maps. Samples obtained from the accumulation area have the strongest predictive ability and can make the models achieve the best performance. The AUC values corresponding to the best model performance on the validation dataset were 0.861, 0.804, and 0.856 for SVM, ANN, and RF respectively. The sampling strategy of the centroid of the scarp area is optimal with the highest quality of debris flow susceptibility maps having scores of 373470, 393241, and 362485 for SVM, ANN, and RF respectively.
The main aim of this study was to evaluate methods for fixed area and distance sampling in the Zagros open forest area in western Iran. Basic forest management and planning require appropriate quantitative and qualitative information. Two sampling methods were compared on the basis of the actual means of characteristics derived from the 100% survey. In total, 37 sampling plots were systematically installed with a grid of 100 m × 100 m in the study area. Density, crown canopy, and basal area of the stands were measured. The 100% survey showed that tree density above 12.5 cm diameter at breast height was 68.04 stems ha⁻¹, basal area was 15.16 m² ha⁻¹, and crown canopy percentage was 35.71% ha⁻¹. The values for the traits determined by the two sampling methods differed significantly (P = 0.05). When the time required for the methods was compared, transect sampling required less time than systematic-random sampling. Therefore, the transect sampling method was the more economical method for the Zagros open forests. The transect sampling method was statistically defensible and practical for quantifying characteristics of the Zagros open forests.
Direct measurement of snow water equivalent (SWE) in snow-dominated mountainous areas is difficult, thus its prediction is essential for water resources management in such areas. In addition, because of the nonlinear trend of snow spatial distribution and the multiple factors influencing the SWE spatial distribution, statistical models are usually not able to produce acceptable results. Therefore, applicable methods that are able to predict nonlinear trends are necessary. In this research, the Sohrevard Watershed located in the northwest of Iran was selected as the case study for SWE prediction. A database was collected, and the required maps were derived. Snow depth (SD) was measured at 150 points with two sampling patterns, systematic random sampling and Latin hypercube sampling (LHS), snow density was measured at 18 randomly selected points, and then SWE was calculated. SWE was predicted using artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), and regression methods. The results showed that the ANN and ANFIS models performed better than the regression method under both sampling patterns. Moreover, based on most of the efficiency criteria, the efficiency of the ANN, ANFIS, and regression methods was higher under the LHS pattern than under the systematic random sampling pattern. However, there were no significant differences between the ANN and ANFIS methods in SWE prediction. Data from both sampling patterns had the highest sensitivity to elevation. In addition, the LHS and the systematic random sampling patterns had the least sensitivity to the profile curvature and plan curvature, respectively.
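For readers unfamiliar with Latin hypercube sampling, the basic construction is short. The following is a generic sketch on the unit hypercube and not the field design used in the study: it splits each dimension into as many equal strata as there are points, shuffles the strata independently per dimension, and draws one uniform value inside each stratum. The function name, the number of points, and the two dimensions are hypothetical.

```python
import random

def latin_hypercube(n_points, n_dims, seed=0):
    """Return n_points tuples in [0, 1)^n_dims such that, in every
    dimension, each of the n_points equal-width strata holds exactly
    one point."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)                      # independent order per dimension
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]

# Ten points over two covariates scaled to [0, 1), e.g. elevation and slope.
points = latin_hypercube(10, 2)
for p in points:
    print(p)
```

In a terrain application, each coordinate would then be mapped back to the observed range of a covariate, and the nearest accessible location with those covariate values would be chosen as a measurement point.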
This article proposes two new Ranked Set Sampling (RSS) designs for estimating the population parameters: Simple Z Ranked Set Sampling (SZRSS) and Generalized Z Ranked Set Sampling (GZRSS). These designs provide unbiased estimators for the mean of symmetric distributions. It is shown that for non-uniform symmetric distributions, the estimators of the mean under the suggested designs are more efficient than those obtained by RSS, Simple Random Sampling (SRS), extreme RSS, and truncation-based RSS designs. Also, the proposed RSS schemes outperform other RSS schemes and provide more efficient estimates than their competitors under imperfect rankings. The suggested mean estimators under perfect and imperfect rankings are more efficient than the linear regression estimator under SRS. Our proposed RSS designs are also extended to cover the estimation of the population median. Real data are used to examine the usefulness and efficiency of our estimators.
In this paper, an importance sampling maximum likelihood (ISML) estimator for direction-of-arrival (DOA) of incoherently distributed (ID) sources is proposed. Starting from the maximum likelihood estimation description of the uniform linear array (ULA), a decoupled concentrated likelihood function (CLF) is presented. A new objective function based on CLF which can obtain a closed-form solution of global maximum is constructed according to Pincus theorem. To obtain the optimal value of the objective function, which is a complex high-dimensional integral, we propose an importance sampling approach based on Monte Carlo random calculation. Next, an importance function is derived, which can simplify the problem of generating a random vector from a high-dimensional probability density function (PDF) to generating a random variable from a one-dimensional PDF. Compared with the existing maximum likelihood (ML) algorithms for DOA estimation of ID sources, the proposed algorithm does not require initial estimates, and its performance is closer to the Cramér-Rao lower bound (CRLB). The proposed algorithm performs better than the existing methods when the interval between sources to be estimated is small and in low signal-to-noise ratio (SNR) scenarios.
Funding: Project supported by the National Natural Science Foundation of China (No. 40171078), Fund from Hong Kong Polytechnic University (No. 1.34.9709), and the Research Grants Council of Hong Kong SAR (No. 3ZB40).
Funding: Supported by the National Key R&D Program of China (2020YFC2003102).
Abstract: Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples in this study using a multistage cluster random sampling method. The basic information questionnaire and Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. The results of the major TCM constitution types showed that BC (43.2%) accounted for the largest proportion and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age >70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
Abstract: The aim of this paper is to compare sample quality across two probability samples and one that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters in order to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that it is possible to achieve the same degree of representativeness using a combination of random route sampling and quota sampling (with substitution) as can be achieved by means of household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distribution of the probability sampling, where the deviations exceed 6%. A different picture emerges when it comes to comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).
Funding: The Science Research Start-up Foundation for Young Teachers of Southwest Jiaotong University (No. 2007Q091)
Abstract: In general, the accuracy of the mean estimator can be improved by stratified random sampling. In this paper, we provide an idea, different from empirical methods, that the accuracy can be further improved through the bootstrap resampling method under some conditions. The determination of sample size by the bootstrap method is also discussed, and a simulation is made to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
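As a rough illustration of bootstrap resampling under stratified sampling, the sketch below resamples with replacement inside each stratum and recomputes the weighted mean each time. It is a generic bootstrap of the stratified mean, not the paper's procedure for sample-size determination; the data, the stratum weights, and the names `stratified_mean` and `bootstrap_se` are hypothetical.

```python
import random
import statistics


def stratified_mean(samples, weights):
    """Weighted mean of stratum sample means (weights are population shares)."""
    return sum(w * statistics.fmean(s) for s, w in zip(samples, weights))


def bootstrap_se(samples, weights, n_boot=2000, seed=0):
    """Bootstrap standard error of the stratified mean.

    Each replicate resamples with replacement within every stratum,
    keeping the stratum sample sizes fixed.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled = [rng.choices(s, k=len(s)) for s in samples]
        estimates.append(stratified_mean(resampled, weights))
    return statistics.stdev(estimates)


# Hypothetical data: two strata with population shares 0.4 and 0.6.
strata_samples = [
    [12.1, 11.8, 12.5, 12.0, 11.9],
    [20.3, 19.7, 21.0, 20.5, 19.9, 20.1],
]
strata_weights = [0.4, 0.6]

estimate = stratified_mean(strata_samples, strata_weights)
se = bootstrap_se(strata_samples, strata_weights)
print("stratified mean:", round(estimate, 3))
print("bootstrap standard error:", round(se, 3))
```

A sample-size rule could then be built by repeating this for candidate sample sizes until the bootstrap standard error falls below a target, though that step is an assumption about how such a rule might work rather than something stated in the abstract.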
Abstract: One of the key assumptions in respondent-driven sampling (RDS) analysis, called the “random selection assumption,” is that respondents randomly recruit their peers from their personal networks. The objective of this study was to verify this assumption in the empirical data of egocentric networks. Methods: We conducted an egocentric network study among young drug users in China, in which RDS was used to recruit this hard-to-reach population. If the random recruitment assumption holds, the RDS-estimated population proportions should be similar to the actual population proportions. Following this logic, we first calculated the population proportions of five visible variables (gender, age, education, marital status, and drug use mode) among the total drug-use alters from which the RDS sample was drawn, and then estimated the RDS-adjusted population proportions and their 95% confidence intervals in the RDS sample. Theoretically, if the random recruitment assumption holds, the 95% confidence intervals estimated in the RDS sample should include the population proportions calculated in the total drug-use alters. Results: The evaluation of the RDS sample indicated its success in reaching the convergence of RDS compositions and including a broad cross-section of the hidden population. Findings demonstrate that the random selection assumption holds for three group traits, but not for two others. Specifically, egos randomly recruited subjects in different age groups, marital status, or drug use modes from their network alters, but not in gender and education levels. Conclusions: This study demonstrates the occurrence of non-random recruitment, indicating that the recruitment of subjects in this RDS study was not completely at random. Future studies are needed to assess the extent to which the population proportion estimates can be biased when the violation of the assumption occurs in some group traits in RDS samples.
Abstract: The objectives of this paper are to demonstrate the algorithms employed by three statistical software programs (R, Real Statistics using Excel, and SPSS) for calculating the exact two-tailed probability of the Wald-Wolfowitz one-sample runs test for randomness, to present a novel approach for computing this probability, and to compare the four procedures by generating samples of 10 and 11 data points, varying the parameters n₀ (number of zeros) and n₁ (number of ones), as well as the number of runs. Fifty-nine samples are created to replicate the behavior of the distribution of the number of runs with 10 and 11 data points. The exact two-tailed probabilities for the four procedures were compared using Friedman’s test. Given the significant difference in central tendency, post-hoc comparisons were conducted using Conover’s test with Benjamini-Yekutieli correction. It is concluded that the procedures of Real Statistics using Excel and R exhibit some inadequacies in the calculation of the exact two-tailed probability, whereas the new proposal and the SPSS procedure are deemed more suitable. The proposed robust algorithm has a more transparent rationale than the SPSS one, albeit being somewhat more conservative. We recommend its implementation for this test and its application to others, such as the binomial and sign test.
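For orientation, the exact null distribution of the number of runs can be computed from the standard combinatorial formula, as in the sketch below. The two-tailed rule used here (sum the probabilities of all outcomes that are no more likely than the observed one) is only one possible convention and is an assumption of this sketch; it is not claimed to match R, Real Statistics, SPSS, or the paper's new proposal. The function names `runs_pmf` and `two_tailed_p` are hypothetical.

```python
from math import comb


def runs_pmf(n0, n1):
    """Exact null distribution of the number of runs R.

    n0 and n1 are the counts of zeros and ones (both at least 1).
    Returns a dict mapping each attainable r to P(R = r).
    """
    total = comb(n0 + n1, n0)
    pmf = {}
    for r in range(2, n0 + n1 + 1):
        k = r // 2
        if r % 2 == 0:
            # r = 2k: k runs of each symbol, starting with either symbol.
            ways = 2 * comb(n0 - 1, k - 1) * comb(n1 - 1, k - 1)
        else:
            # r = 2k + 1: one symbol has k + 1 runs, the other has k.
            ways = (comb(n0 - 1, k) * comb(n1 - 1, k - 1)
                    + comb(n0 - 1, k - 1) * comb(n1 - 1, k))
        if ways:
            pmf[r] = ways / total
    return pmf


def two_tailed_p(r_obs, n0, n1):
    """Two-tailed p-value under one assumed convention: add up the
    probabilities of all outcomes no more likely than the observed one."""
    pmf = runs_pmf(n0, n1)
    p_obs = pmf.get(r_obs, 0.0)
    # Small relative tolerance so that ties are not lost to rounding.
    return sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-9))


print(runs_pmf(5, 5))
print(two_tailed_p(2, 5, 5))
```

Other conventions, such as doubling the smaller one-sided tail, can give different values for the same sample, which is the kind of discrepancy the abstract compares.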
Abstract: In this paper, we propose a software component under Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling) as required by the simulation. RDS is regarded as the best sampling method, as shown in the literature. In order to validate the proposed component, it is applied to approximating integrals. The simulation results from RDS using the "RDSRnd" generator were compared to those obtained using the generator "Rnd" included in the Pascal programming language under Windows. The best results are given by the proposed software component.
Abstract: In this study we have proposed a modified ratio type estimator for population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and median of an auxiliary variable x. The estimator’s properties have been derived up to first order of Taylor’s series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies have been done using real populations to demonstrate the performance of the developed estimator in comparison with the existing estimators. The proposed estimator, as illustrated by the empirical studies, performs better than the existing estimators under some specified conditions, i.e., it has the smallest Mean Squared Error and the highest Percentage Relative Efficiency. The developed estimator is therefore suitable for situations in which the variable of interest has a positive correlation with the auxiliary variable.
Abstract: In this paper, auxiliary information is used to determine an estimator of finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted by making use of the local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared to other estimators.
Abstract: This paper analyzes the methodology for applying stratified random sampling with optimum allocation to a research subject that concerns the rural population and shows large differences among the three strata into which this population can be classified. The rural population of Evros Prefecture (Greece) was classified, using the mean altitude of settlements as the criterion, into three strata (the mountainous, semi-mountainous, and flat populations) in order to estimate the mean consumption of forest fuelwood for heating and cooking needs in households of these three strata. The analysis of this methodology includes: (1) the determination of the total sample size for the entire rural population and its allocation to the various strata; (2) the investigation of the effectiveness of stratification with analysis of variance (One-Way ANOVA); (3) the conduct of the sampling research through face-to-face interviews in selected households; and (4) the checking of the questionnaire forms and the analysis of data using the Statistical Package for the Social Sciences, SPSS for Windows. All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
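The allocation step in item (1) can be sketched with the textbook Neyman rule, the equal-cost special case of optimum allocation, in which the sample size of stratum h is proportional to N_h S_h (stratum population size times stratum standard deviation). Whether the paper uses exactly this form is not stated in the abstract, and the stratum sizes and standard deviations below are invented for illustration, not taken from the Evros study.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Allocate total_n sample units across strata in proportion to N_h * S_h.

    stratum_sizes: population size N_h of each stratum
    stratum_sds:   standard deviation S_h of the study variable in each
                   stratum (in practice estimated from a pilot sample)
    Returns unrounded allocations; rounding is left to the caller.
    """
    products = [n_h * s_h for n_h, s_h in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [total_n * p / total for p in products]


# Hypothetical figures for three strata (mountainous, semi-mountainous, flat).
sizes = [1000, 2000, 3000]
sds = [10.0, 5.0, 2.0]
allocation = neyman_allocation(260, sizes, sds)
print(allocation)
```

Strata that are larger or more variable receive more of the sample, which is why pilot estimates of the stratum standard deviations are needed before the main survey.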
Abstract: In this paper, the problem of nonparametric estimation of the finite population quantile function using the multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of non-parametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Abstract: Srivastava and Jhajj [16] proposed a class of estimators for estimating population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for estimating the population variance in simple random sampling. The expressions for the mean square error of the proposed class have been derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62032009 and 62102440.
Abstract: The random sampling algorithm was first proposed by Schnorr in 2003 to find short lattice vectors, as an alternative to enumeration. The follow-up developments in random sampling were mainly proposed by Fukase and Kashiwabara in 2015 and Aono and Nguyen in 2017. Although they extended the sampling space compared to Schnorr's work through the natural number representation, they did not show how to sample specifically in practice and what vectors should be sampled in order to find short enough lattice vectors. In this paper, the authors first introduce a practical random sampling algorithm, under some reasonable assumptions, which can find short enough lattice vectors efficiently. Then, as an application of this new random sampling algorithm, the authors show that it can improve the performance of the progressive BKZ algorithm in practice. Finally, the authors solve the Darmstadt's Lattice Challenge and get a series of new records in the dimension from 500 to 825, using the improved progressive BKZ algorithm.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 41972267 and 41572257) and the Graduate Innovation Fund of Jilin University (Grant No. 101832020CX232).
Abstract: The quality of debris flow susceptibility mapping varies with sampling strategies. This paper aims at comparing three sampling strategies and determining the optimal one to sample the debris flow watersheds. The three sampling strategies studied were the centroid of the scarp area (COSA), the centroid of the flowing area (COFA), and the centroid of the accumulation area (COAA) of debris flow watersheds. An inventory consisting of 150 debris flow watersheds and 12 conditioning factors was prepared for research. Firstly, the information gain ratio (IGR) method was used to analyze the predictive ability of the conditioning factors. Subsequently, the 12 conditioning factors were used in the modeling of artificial neural network (ANN), random forest (RF), and support vector machine (SVM). Then, the receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate the model performance. Finally, a scoring system was used to score the quality of the debris flow susceptibility maps. Samples obtained from the accumulation area have the strongest predictive ability and can make the models achieve the best performance. The AUC values corresponding to the best model performance on the validation dataset were 0.861, 0.804, and 0.856 for SVM, ANN, and RF respectively. The sampling strategy of the centroid of the scarp area is optimal with the highest quality of debris flow susceptibility maps having scores of 373470, 393241 and 362485 for SVM, ANN and RF respectively.
Abstract: The main aim of this study was to evaluate methods for fixed area and distance sampling in the Zagros open forest area in western Iran. Basic forest management and planning require appropriate quantitative and qualitative information. Two sampling methods were compared on the basis of the actual means of characteristics derived from the 100% survey. In total, 37 sampling plots were systematically installed with a grid of 100 m × 100 m in the study area. Density, crown canopy, and basal area of the stands were measured. The 100% survey showed that tree density above 12.5 cm diameter at breast height was 68.04 stems ha⁻¹, basal area was 15.16 m² ha⁻¹, and crown canopy percentage was 35.71% ha⁻¹. The values for the traits determined by the two sampling methods differed significantly (P = 0.05). When the time required for the methods was compared, transect sampling required less time than systematic-random sampling. Therefore, the transect sampling method was the more economical method for the Zagros open forests. The transect sampling method was statistically defensible and practical for quantifying characteristics of the Zagros open forests.
Abstract: Direct measurement of snow water equivalent (SWE) in snow-dominated mountainous areas is difficult, thus its prediction is essential for water resources management in such areas. In addition, because of the nonlinear trend of snow spatial distribution and the multiple factors influencing the SWE spatial distribution, statistical models are not usually able to present acceptable results. Therefore, applicable methods that are able to predict nonlinear trends are necessary. In this research, to predict SWE, the Sohrevard Watershed located in the northwest of Iran was selected as the case study. A database was collected, and the required maps were derived. Snow depth (SD) at 150 points with two sampling patterns, systematic random sampling and Latin hypercube sampling (LHS), and snow density at 18 points were randomly measured, and then SWE was calculated. SWE was predicted using artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), and regression methods. The results showed that the ANN and ANFIS models performed better than the regression method under both sampling patterns. Moreover, based on most of the efficiency criteria, the efficiency of the ANN, ANFIS, and regression methods was higher under the LHS pattern than under the systematic random sampling pattern. However, there were no significant differences between the ANN and ANFIS methods in SWE prediction. Data from both sampling patterns had the highest sensitivity to elevation. In addition, the LHS and the systematic random sampling patterns had the least sensitivity to the profile curvature and plan curvature, respectively.
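The idea behind Latin hypercube sampling can be shown with a small generic sketch on the unit hypercube: each axis is cut into as many equal strata as there are points, and every stratum on every axis receives exactly one point. This is not the study's field sampling design; how the authors mapped LHS onto terrain variables is not described in the abstract, and the function name `latin_hypercube` and the point counts are assumptions.

```python
import random


def latin_hypercube(n_points, n_dims, seed=0):
    """Draw n_points samples in [0, 1)^n_dims by Latin hypercube sampling."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Randomly assign one stratum index per point on this axis.
        strata = list(range(n_points))
        rng.shuffle(strata)
        # Place each point uniformly at random inside its stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


points = latin_hypercube(8, 2)
for p in points:
    print(tuple(round(x, 3) for x in p))
```

To use such points for site selection, each coordinate would be rescaled to the range of a covariate such as elevation or slope, which is a step this sketch leaves out.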
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Research Groups Program under Grant No. R.G.P.2/68/41. I.M.A. and A.I.A. received the grant.
Abstract: This article proposes two new Ranked Set Sampling (RSS) designs for estimating the population parameters: Simple Z Ranked Set Sampling (SZRSS) and Generalized Z Ranked Set Sampling (GZRSS). These designs provide unbiased estimators for the mean of symmetric distributions. It is shown that for non-uniform symmetric distributions, the estimators of the mean under the suggested designs are more efficient than those obtained by RSS, Simple Random Sampling (SRS), extreme RSS, and truncation-based RSS designs. Also, the proposed RSS schemes outperform other RSS schemes and provide more efficient estimates than their competitors under imperfect rankings. The suggested mean estimators under perfect and imperfect rankings are more efficient than the linear regression estimator under SRS. Our proposed RSS designs are also extended to cover the estimation of the population median. Real data is used to examine the usefulness and efficiency of our estimators.
Funding: Supported by the Basic Research Program of Natural Science in Shaanxi Province of China (2021JQ-369).
Abstract: In this paper, an importance sampling maximum likelihood (ISML) estimator for direction-of-arrival (DOA) of incoherently distributed (ID) sources is proposed. Starting from the maximum likelihood estimation description of the uniform linear array (ULA), a decoupled concentrated likelihood function (CLF) is presented. A new objective function based on the CLF, which can obtain a closed-form solution of the global maximum, is constructed according to the Pincus theorem. To obtain the optimal value of the objective function, which is a complex high-dimensional integral, we propose an importance sampling approach based on Monte Carlo random calculation. Next, an importance function is derived, which can simplify the problem of generating a random vector from a high-dimensional probability density function (PDF) to generating a random variable from a one-dimensional PDF. Compared with the existing maximum likelihood (ML) algorithms for DOA estimation of ID sources, the proposed algorithm does not require initial estimates, and its performance is closer to the Cramér-Rao lower bound (CRLB). The proposed algorithm performs better than the existing methods when the interval between sources to be estimated is small and in low signal-to-noise ratio (SNR) scenarios.