Based on the principles of simple random sampling, a statistical model of the rate of disfigurement (RD) is proposed and described in detail. Following the definition of simple random sampling for attribute data in GIS, the mean and variance of the RD are derived as the characteristic values of the statistical model, in order to show that the RD is a feasible measure of the accuracy of attribute data in GIS. On the basis of the mean and variance of the RD, a quality assessment method for the attribute data of vector maps during data collection is then discussed. An RD spread graph is also drawn to check whether the quality of the attribute data is under control. The RD model can judge the quality of attribute data comprehensively, unlike other measurement coefficients, which address only classification accuracy.
A composite random variable is a product (or sum of products) of statistically distributed quantities. Such a variable can represent the solution to a multi-factor quantitative problem submitted to a large, diverse, independent, anonymous group of non-expert respondents (the “crowd”). The objective of this research is to examine the statistical distribution of solutions from a large crowd to a quantitative problem involving image analysis and object counting. Theoretical analysis by the author, covering a range of conditions and types of factor variables, predicts that composite random variables are distributed log-normally to an excellent approximation. If the factors in a problem are themselves distributed log-normally, then their product is rigorously log-normal. A crowdsourcing experiment devised by the author and implemented with the assistance of a BBC (British Broadcasting Corporation) television show yielded a sample of approximately 2000 responses consistent with a log-normal distribution. The sample mean was within ~12% of the true count. However, a Monte Carlo simulation (MCS) of the experiment, employing either normal or log-normal random variables as factors to model the processes by which a crowd of 1 million might arrive at their estimates, resulted in a visually perfect log-normal distribution with a mean response within ~5% of the true count. The results of this research suggest that a well-modeled MCS, by simulating a sample of responses from a large, rational, and incentivized crowd, can provide a more accurate solution to a quantitative problem than might be attainable by direct sampling of a smaller crowd or an uninformed crowd, irrespective of size, that guesses randomly.
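The statement above that a product of log-normal factors is itself log-normal can be checked numerically. The following is only a minimal sketch, not the author's simulation: the three factors and their parameters in `FACTORS` are hypothetical, and `simulate_products` is an illustrative helper name. It relies on the fact that the logarithm of the product is a sum of independent normal variables, so its mean is the sum of the factor means and its variance is the sum of the factor variances.

```python
import math
import random
import statistics


def simulate_products(factor_params, n_samples, seed=0):
    """Draw n_samples products of independent log-normal factors.

    factor_params is a list of (mu, sigma) pairs describing the normal
    distribution of the logarithm of each factor.
    """
    rng = random.Random(seed)
    return [
        math.prod(rng.lognormvariate(mu, sigma) for mu, sigma in factor_params)
        for _ in range(n_samples)
    ]


# Hypothetical factors (illustrative values, not taken from the study).
FACTORS = [(1.0, 0.3), (2.0, 0.5), (0.5, 0.2)]

products = simulate_products(FACTORS, n_samples=20000)
logs = [math.log(p) for p in products]

# Theory: log(product) is normal with these parameters.
expected_mu = sum(mu for mu, _ in FACTORS)
expected_sigma = math.sqrt(sum(sigma ** 2 for _, sigma in FACTORS))

print("mean of log(product):", statistics.fmean(logs), "expected:", expected_mu)
print("sd of log(product):  ", statistics.stdev(logs), "expected:", expected_sigma)
```

With normally distributed factors instead, the log-normal form would hold only approximately, which is the case the abstract describes as "an excellent approximation".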
In this work, empirical models describing sampling error (Δ) are reported, based upon analytical findings elicited from three common probability density functions (PDFs): the Gaussian, representing any real-valued, randomly changing variable x of mean μ and standard deviation σ; the Poisson, representing counting data, i.e., any integral-valued entity's count of x (cells, clumps of cells or colony-forming units, molecules, mutations, etc.) per tested volume, area, length of time, etc., with population mean μ; and the binomial, representing the number of successful occurrences of something (x+) out of n observations or sub-samplings. These data were generated in such a way as to simulate what should be observed in practice while avoiding other forms of experimental error. Based upon analyses of 104 Δ measurements, we show that the average Δ is proportional to […] ($\sigma_x \cdot \mu^{-1}$; Gaussian) or […] (Poisson and binomial). The average proportionality constants associated with these disparate populations were also nearly identical ([…]; ± s). However, since […] for any Poisson process, […]. In a similar vein, we have empirically demonstrated that the binomial-associated […] were also proportional to $\sigma_x \cdot \mu^{-1}$. Furthermore, we established that, when all […] were plotted against either […] or $\sigma_x \cdot \mu^{-1}$, there was only one relationship, with a slope = A (0.767 ± 0.0990) and a near-zero intercept. This latter finding also argues that all […], regardless of parent PDF, are proportional to $\sigma_x \cdot \mu^{-1}$, which is the coefficient of variation for a population of sample means ([…]). Lastly, we establish that the proportionality constant A is equivalent to the coefficient of variation associated with Δ ([…]) measurement and, therefore, […]. These results are noteworthy inasmuch as they provide a straightforward empirical link between stochastic sampling error and the aforementioned CVs. Finally, we demonstrate that all attendant empirical measures of Δ are reasonably small (e.g., […]) when an environmental microbiome was well-sampled: n = 16-18 observations with μ ∼ 3 isolates per observation. These colony counting results were supported by the fact that the two major isolates’ relative abundance was reproducible in the four most probable composition observations from one common population.
The aim of this study is to investigate the impact of landslide and non-landslide sampling strategies on the performance of landslide susceptibility assessment (LSA). The study area is the Feiyun catchment in Wenzhou City, Southeast China. Two types of landslide samples, combined with seven non-landslide sampling strategies, resulted in a total of 14 scenarios. The corresponding landslide susceptibility map (LSM) for each scenario was generated using the random forest model. The receiver operating characteristic (ROC) curve and statistical indicators were calculated and used to assess the impact of the dataset sampling strategy. The results showed that higher accuracies were achieved when using the landslide core as positive samples, combined with non-landslide sampling from the very low zone or buffer zone. The results reveal the influence of landslide and non-landslide sampling strategies on the accuracy of LSA, which provides a reference for subsequent researchers aiming to obtain a more reasonable LSM.
In general, the accuracy of a mean estimator can be improved by stratified random sampling. In this paper, we propose an idea that differs from empirical methods: under some conditions, the accuracy can be improved further through bootstrap resampling. The determination of sample size by the bootstrap method is also discussed, and a simulation is carried out to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
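To make the bootstrap idea in the abstract above concrete, the sketch below estimates the standard error of a sample mean by resampling with replacement and compares it with the central-limit-theorem formula s/√n. It is an illustration under stated assumptions only: the data are hypothetical draws from an exponential distribution, `bootstrap_se_of_mean` is an illustrative helper name, and the paper's own sample-size rule is not reproduced here.

```python
import math
import random
import statistics


def bootstrap_se_of_mean(sample, n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of the sample mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values with replacement from the original sample.
    resampled_means = [
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    ]
    return statistics.stdev(resampled_means)


# Hypothetical skewed data (assumption: exponential with rate 1).
data_rng = random.Random(1)
sample = [data_rng.expovariate(1.0) for _ in range(200)]

se_boot = bootstrap_se_of_mean(sample)
se_clt = statistics.stdev(sample) / math.sqrt(len(sample))

print("bootstrap SE:", se_boot)
print("CLT SE:      ", se_clt)
```

For a plain sample mean the two estimates should agree closely; any advantage of the bootstrap claimed in the abstract depends on the conditions stated in the paper.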
In this paper, the methodology for applying stratified random sampling with optimum allocation is analyzed for a research subject that concerns the rural population and differs strongly among the three strata into which this population can be classified. The rural population of Evros Prefecture (Greece) was classified, using the mean altitude of settlements as the criterion, into three strata (mountainous, semi-mountainous and flat) in order to estimate the mean consumption of forest fuelwood for heating and cooking in the households of these strata. The analysis of this methodology includes: (1) determining the total sample size for the entire rural population and its allocation to the strata; (2) investigating the effectiveness of stratification by analysis of variance (one-way ANOVA); (3) conducting the sample survey through face-to-face interviews in selected households; and (4) checking the questionnaire forms and analyzing the data with the Statistical Package for the Social Sciences (SPSS for Windows). All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
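Step (1) of the methodology above, allocating a total sample across strata, can be sketched with the standard Neyman form of optimum allocation, in which stratum h receives a share proportional to N_h·S_h (stratum size times stratum standard deviation). This is a minimal sketch assuming equal sampling cost in every stratum; the stratum sizes and standard deviations are hypothetical placeholders, not the Evros pilot data, and `neyman_allocation` is an illustrative helper name.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Split total_n across strata in proportion to N_h * S_h.

    Returns unrounded allocations; in practice they would be rounded
    and adjusted so that they still sum to total_n.
    """
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


# Hypothetical strata: mountainous, semi-mountainous, flat.
sizes = [1000, 2000, 3000]   # households per stratum (assumed)
sds = [4.0, 2.0, 1.0]        # pilot standard deviations (assumed)

allocation = neyman_allocation(110, sizes, sds)
print(allocation)
```

The small but highly variable stratum receives as many interviews as a stratum twice its size, which is why optimum allocation pays off when the strata differ strongly.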
In this paper, we propose a software component for Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling) as required by the simulation. RDS is regarded in the literature as the best sampling method. To validate the proposed component, it is applied to approximating integrals. The simulation results obtained from RDS with the "RDSRnd" generator were compared with those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The proposed software component gives the best results.
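The comparison described above can be illustrated on a one-dimensional integral. The sketch below does not implement Refined Descriptive Sampling or the "RDSRnd" component; as a stand-in it uses plain descriptive sampling, where the input values are the midpoints of n equal-probability intervals in random order, and compares it with ordinary pseudo-random sampling. The integrand x² on [0, 1] and the function names are illustrative assumptions.

```python
import random


def random_sampling_estimate(f, n, seed=0):
    """Estimate the integral of f over [0, 1] from n pseudo-random points."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n


def descriptive_sampling_estimate(f, n, seed=0):
    """Estimate the same integral from a descriptive sample.

    The values are fixed at the midpoints of n equal-probability
    intervals; only their order is random.
    """
    rng = random.Random(seed)
    points = [(i + 0.5) / n for i in range(n)]
    rng.shuffle(points)  # the order does not affect a plain average
    return sum(f(x) for x in points) / n


def integrand(x):
    return x * x  # exact integral over [0, 1] is 1/3


exact = 1.0 / 3.0
mc_estimate = random_sampling_estimate(integrand, 1000)
ds_estimate = descriptive_sampling_estimate(integrand, 1000)

print("random sampling error:     ", abs(mc_estimate - exact))
print("descriptive sampling error:", abs(ds_estimate - exact))
```

In one dimension the descriptive sample reduces to the midpoint rule, so its error here is far below the sampling noise of the pseudo-random estimate; the random ordering matters only when several inputs are combined in a simulation.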
In this paper, auxiliary information is used to determine an estimator of finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted by making use of the local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared to other estimators.
The aim of this paper is to compare sample quality across two probability samples and one sample that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that a combination of random route sampling and quota sampling (with substitution) can achieve the same degree of representativeness as household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distributions of the probability samples, where the deviations exceed 6%. A different picture emerges when comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).
Srivastava and Jhajj [16] proposed a class of estimators for estimating the population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more general class of estimators for estimating the population variance in simple random sampling. The expressions for the mean square error (MSE) of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
One of the key assumptions in respondent-driven sampling (RDS) analysis, called “random selection assumption,” is that respondents randomly recruit their peers from their personal networks. The objective of this study was to verify this assumption in the empirical data of egocentric networks. Methods: We conducted an egocentric network study among young drug users in China, in which RDS was used to recruit this hard-to-reach population. If the random recruitment assumption holds, the RDS-estimated population proportions should be similar to the actual population proportions. Following this logic, we first calculated the population proportions of five visible variables (gender, age, education, marital status, and drug use mode) among the total drug-use alters from which the RDS sample was drawn, and then estimated the RDS-adjusted population proportions and their 95% confidence intervals in the RDS sample. Theoretically, if the random recruitment assumption holds, the 95% confidence intervals estimated in the RDS sample should include the population proportions calculated in the total drug-use alters. Results: The evaluation of the RDS sample indicated its success in reaching the convergence of RDS compositions and including a broad cross-section of the hidden population. Findings demonstrate that the random selection assumption holds for three group traits, but not for two others. Specifically, egos randomly recruited subjects in different age groups, marital status, or drug use modes from their network alters, but not in gender and education levels. Conclusions: This study demonstrates the occurrence of non-random recruitment, indicating that the recruitment of subjects in this RDS study was not completely at random. Future studies are needed to assess the extent to which the population proportion estimates can be biased when the violation of the assumption occurs in some group traits in RDS samples.
In this study, we propose a modified ratio-type estimator for the population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and the median of an auxiliary variable x. The estimator’s properties are derived up to the first order of the Taylor series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies using real populations demonstrate the performance of the developed estimator in comparison with the existing estimators. As these studies illustrate, the proposed estimator performs better than the existing estimators under some specified conditions, i.e., it has the smallest mean squared error and the highest percentage relative efficiency. The developed estimator is therefore suitable for situations in which the variable of interest has a positive correlation with the auxiliary variable.
Non-response is a regular occurrence in Sample Surveys. Developing estimators when non-response exists may result in large biases when estimating population parameters. In this paper, a finite population mean is estimated when non-response exists randomly under two stage cluster sampling with replacement. It is assumed that non-response arises in the survey variable in the second stage of cluster sampling. Weighting method of compensating for non-response is applied. Asymptotic properties of the proposed estimator of the population mean are derived. Under mild assumptions, the estimator is shown to be asymptotically consistent.
Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples in this study using a multistage cluster random sampling method. The basic information questionnaire and the Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. The results of the major TCM constitution types showed that BC (43.2%) accounted for the largest proportion and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age > 70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
In practical environments, base excitation and crosswind very commonly occur simultaneously. Scavenging the combined energy of vibration and wind with a single energy harvesting structure is fascinating. For this purpose, the effects of the wind speed and random excitation level are investigated with the stochastic averaging method (SAM) based on the energy envelope. The results of the analytical prediction are verified with the Monte-Carlo method (MCM). The numerical simulation shows that the introduction of wind can reduce the critical excitation level for triggering an inter-well jump and make a bi-stable energy harvester (BEH) realize the performance enhancement for a weak base excitation. However, as the strength of the wind increases to a particular level, the influence of the random base excitation on the dynamic responses is weakened, and the system exhibits a periodic galloping response. A comparison between a BEH and a linear energy harvester (LEH) indicates that the BEH demonstrates inferior performance for high-speed wind. Relevant experiments are conducted to investigate the validity of the theoretical prediction and numerical simulation. The experimental findings also show that strong random excitation is favorable for the BEH in the range of low wind speeds. However, as the speed of the incoming wind reaches a particular level, the disadvantage of the BEH becomes evident.
Global variance reduction is a bottleneck in Monte Carlo shielding calculations. The global variance reduction problem requires that the statistical error of the entire space is uniform. This study proposed a grid-AIS method for the global variance reduction problem based on the AIS method, which was implemented in the Monte Carlo program MCShield. The proposed method was validated using the VENUS-Ⅲ international benchmark problem and a self-shielding calculation example. The results from the VENUS-Ⅲ benchmark problem showed that the grid-AIS method achieved a significant reduction in the variance of the statistical errors of the MESH grids, decreasing from 1.08×10^(-2) to 3.84×10^(-3), representing a 64.00% reduction. This demonstrates that the grid-AIS method is effective in addressing global issues. The results of the self-shielding calculation demonstrate that the grid-AIS method produced accurate computational results. Moreover, the grid-AIS method exhibited a computational efficiency approximately one order of magnitude higher than that of the AIS method and approximately two orders of magnitude higher than that of the conventional Monte Carlo method.
Signals are often of random character: since they cannot bear any information if they are predictable for any time t, they are usually modelled as stationary random processes. On the other hand, because of the inertia of the measurement apparatus, the sampled values measured in practice may not be the precise values of the signal X(t) at times t_k (k ∈ Z), but only local averages of X(t) near t_k. In this paper, it is shown that a wide (or weak) sense stationary stochastic process can be approximated by generalized sampling series with local average samples.
BACKGROUND: The mucosal barrier's immune-brain interactions, pivotal for neural development and function, are increasingly recognized for their potential causal and therapeutic relevance to irritable bowel syndrome (IBS). Prior studies linking immune inflammation with IBS have been inconsistent. To further elucidate this relationship, we conducted a Mendelian randomization (MR) analysis of 731 immune cell markers to dissect the influence of various immune phenotypes on IBS. Our goal was to deepen our understanding of the disrupted brain-gut axis in IBS and to identify novel therapeutic targets. AIM: To leverage publicly available data to perform MR analysis on 731 immune cell markers and explore their impact on IBS. We aimed to uncover immunophenotypic associations with IBS that could inform future drug development and therapeutic strategies. METHODS: We performed a comprehensive two-sample MR analysis to evaluate the causal relationship between immune cell markers and IBS. By utilizing genetic data from public databases, we examined the causal associations between 731 immune cell markers, encompassing median fluorescence intensity, relative cell abundance, absolute cell count, and morphological parameters, and IBS susceptibility. Sensitivity analyses were conducted to validate our findings and address potential heterogeneity and pleiotropy. RESULTS: Bidirectional false discovery rate correction indicated no significant influence of IBS on immunophenotypes. However, our analysis revealed a causal impact of IBS on 30 out of 731 immune phenotypes (P < 0.05). Nine immune phenotypes demonstrated a protective effect against IBS [inverse variance weighting (IVW) < 0.05, odds ratio (OR) < 1], while 21 others were associated with an increased risk of IBS onset (IVW ≥ 0.05, OR ≥ 1). CONCLUSION: Our findings underscore a substantial genetic correlation between immune cell phenotypes and IBS, providing valuable insights into the pathophysiology of the condition. These results pave the way for the development of more precise biomarkers and targeted therapies for IBS. Furthermore, this research enriches our comprehension of immune cell roles in IBS pathogenesis, offering a foundation for more effective, personalized treatment approaches. These advancements hold promise for improving IBS patient quality of life and reducing the disease burden on individuals and their families.
As massive underground projects have become popular in dense urban cities, a problem has arisen: which model best predicts Tunnel Boring Machine (TBM) performance in these tunneling projects? The performance of TBMs in complex geological conditions is still a great challenge for practitioners and researchers, yet a reliable and accurate prediction of TBM performance is essential to planning an applicable tunnel construction schedule. TBM performance is very difficult to estimate because of various geotechnical and geological factors and machine specifications. The intelligent techniques previously proposed in this field are mostly based on a single or base model with a low level of accuracy. Hence, this study aims to introduce a hybrid random forest (RF) technique optimized by global harmony search with generalized opposition-based learning (GOGHS) for forecasting TBM advance rate (AR). Optimizing the RF hyper-parameters, e.g., tree number and maximum tree depth, is the main objective of using the GOGHS-RF model. In the modelling of this study, a comprehensive database with the most influential parameters on TBM, together with TBM AR, was used as input and output variables, respectively. To examine the capability and power of the GOGHS-RF model, three more hybrid models, particle swarm optimization-RF, genetic algorithm-RF and artificial bee colony-RF, were also constructed to forecast TBM AR. The developed models were evaluated by calculating several performance indices, including the determination coefficient (R²), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results showed that the GOGHS-RF is a more accurate technique for estimating TBM AR than the other applied models. The newly developed GOGHS-RF model achieved R² = 0.9937 and 0.9844 for the train and test stages, respectively, which are higher than those of a pre-developed RF. Also, the importance of the input parameters was interpreted through the SHapley Additive exPlanations (SHAP) method, and it was found that thrust force per cutter is the most important variable for TBM AR. The GOGHS-RF model can be used in mechanized tunnel projects for predicting and checking performance.
Funding: Project supported by the National Natural Science Foundation of China (No. 40171078), a fund from Hong Kong Polytechnic University (No. 1.34.9709) and the Research Grants Council of Hong Kong SAR (No. 3ZB40).
文摘On the basis of the principles of simple random sampling, the statistical model of rate of disfigurement (RD) is put forward and described in detail. According to the definition of simple random sampling for the attribute data in GIS, the mean and variance of the RD are deduced as the characteristic value of the statistical model in order to explain the feasibility of the accuracy measurement of the attribute data in GIS by using the RD. Moreover, on the basis of the mean and variance of the RD, the quality assessment method for attribute data of vector maps during the data collecting is discussed. The RD spread graph is also drawn to see whether the quality of the attribute data is under control. The RD model can synthetically judge the quality of attribute data, which is different from other measurement coefficients that only discuss accuracy of classification.
文摘A composite random variable is a product (or sum of products) of statistically distributed quantities. Such a variable can represent the solution to a multi-factor quantitative problem submitted to a large, diverse, independent, anonymous group of non-expert respondents (the “crowd”). The objective of this research is to examine the statistical distribution of solutions from a large crowd to a quantitative problem involving image analysis and object counting. Theoretical analysis by the author, covering a range of conditions and types of factor variables, predicts that composite random variables are distributed log-normally to an excellent approximation. If the factors in a problem are themselves distributed log-normally, then their product is rigorously log-normal. A crowdsourcing experiment devised by the author and implemented with the assistance of a BBC (British Broadcasting Corporation) television show, yielded a sample of approximately 2000 responses consistent with a log-normal distribution. The sample mean was within ~12% of the true count. However, a Monte Carlo simulation (MCS) of the experiment, employing either normal or log-normal random variables as factors to model the processes by which a crowd of 1 million might arrive at their estimates, resulted in a visually perfect log-normal distribution with a mean response within ~5% of the true count. The results of this research suggest that a well-modeled MCS, by simulating a sample of responses from a large, rational, and incentivized crowd, can provide a more accurate solution to a quantitative problem than might be attainable by direct sampling of a smaller crowd or an uninformed crowd, irrespective of size, that guesses randomly.
Abstract: In this work, empirical models describing sampling error (Δ) are reported, based upon analytical findings elicited from three common probability density functions (PDFs): the Gaussian, representing any real-valued, randomly changing variable x of mean μ and standard deviation σ; the Poisson, representing counting data, i.e., any integral-valued entity's count of x (cells, clumps of cells or colony-forming units, molecules, mutations, etc.) per tested volume, area, length of time, etc., with population mean μ; and binomial data, representing the number of successful occurrences of something (x+) out of n observations or sub-samplings. These data were generated in such a way as to simulate what should be observed in practice but avoid other forms of experimental error. Based upon analyses of 104 Δ measurements, we show that the average Δ is proportional to […] (σx·μ^(-1); Gaussian) or […] (Poisson and binomial). The average proportionality constants associated with these disparate populations were also nearly identical ([…]; ±s). However, since […] for any Poisson process, […]. In a similar vein, we have empirically demonstrated that the binomial-associated […] were also proportional to σx·μ^(-1). Furthermore, we established that, when all […] were plotted against either […] or σx·μ^(-1), there was only one relationship, with a slope = A (0.767 ± 0.0990) and a near-zero intercept. This latter finding also argues that all […], regardless of parent PDF, are proportional to σx·μ^(-1), which is the coefficient of variation for a population of sample means. Lastly, we establish that the proportionality constant A is equivalent to the coefficient of variation associated with the Δ measurement and, therefore, […]. These results are noteworthy inasmuch as they provide a straightforward empirical link between stochastic sampling error and the aforementioned CVs.
Finally, we demonstrate that all attendant empirical measures of Δ are reasonably small (e.g., […]) when an environmental microbiome was well sampled: n = 16-18 observations with μ ∼ 3 isolates per observation. These colony counting results were supported by the fact that the relative abundance of the two major isolates was reproducible in the four most probable composition observations from one common population.
Abstract: The aim of this study is to investigate the impact of landslide and non-landslide sampling strategies on the performance of landslide susceptibility assessment (LSA). The study area is the Feiyun catchment in Wenzhou City, Southeast China. Two types of landslide samples, combined with seven non-landslide sampling strategies, resulted in a total of 14 scenarios. The corresponding landslide susceptibility map (LSM) for each scenario was generated using the random forest model. The receiver operating characteristic (ROC) curve and statistical indicators were calculated and used to assess the impact of the dataset sampling strategy. The results showed that higher accuracies were achieved when using the landslide core as positive samples, combined with non-landslide sampling from the very low zone or buffer zone. The results reveal the influence of landslide and non-landslide sampling strategies on the accuracy of LSA, which provides a reference for subsequent researchers aiming to obtain a more reasonable LSM.
Funding: The Science Research Start-up Foundation for Young Teachers of Southwest Jiaotong University (No. 2007Q091)
Abstract: In general, the accuracy of a mean estimator can be improved by stratified random sampling. In this paper, we present an idea, distinct from empirical methods, that the accuracy can be improved further through the bootstrap resampling method under some conditions. The determination of sample size by the bootstrap method is also discussed, and a simulation is carried out to verify the accuracy of the proposed method. The simulation results show that the sample size based on bootstrapping is smaller than that based on the central limit theorem.
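One common way to combine the two ideas is to resample with replacement inside each stratum and recompute the stratified mean each time. The sketch below shows that generic within-stratum bootstrap; it is not necessarily the procedure proposed in the paper, and the data, stratum weights, and function names are made up for illustration.

```python
import random
import statistics


def stratified_mean(samples, weights):
    """Weighted mean of stratum sample means; weights are population shares N_h / N."""
    return sum(w * statistics.fmean(s) for s, w in zip(samples, weights))


def bootstrap_se(samples, weights, n_boot=2000, seed=0):
    """Bootstrap standard error: resample with replacement inside each stratum."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled = [rng.choices(s, k=len(s)) for s in samples]
        estimates.append(stratified_mean(resampled, weights))
    return statistics.stdev(estimates)


# Hypothetical samples from three strata and their assumed population shares.
strata = [
    [10, 12, 11, 13, 9, 12, 10, 11],
    [20, 25, 22, 27, 24, 21, 26, 23],
    [40, 38, 45, 50, 42, 47, 44, 41],
]
weights = [0.5, 0.3, 0.2]

estimate = stratified_mean(strata, weights)
se = bootstrap_se(strata, weights)
print(f"stratified mean = {estimate:.3f}, bootstrap SE = {se:.3f}")
```

Resampling within strata keeps each stratum's sample size fixed, so the bootstrap replicates respect the original design.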
Abstract: This paper analyzes the methodology for applying stratified random sampling with optimum allocation to a research subject that concerns the rural population and shows large differences among the three strata into which this population can be classified. The rural population of Evros Prefecture (Greece) was classified, using the mean altitude of settlements as the criterion, into three strata (mountainous, semi-mountainous and flat) in order to estimate the mean consumption of forest fuelwood for heating and cooking in the households of these strata. The analysis of this methodology includes: (1) determining the total sample size for the entire rural population and allocating it to the strata; (2) investigating the effectiveness of the stratification by analysis of variance (one-way ANOVA); (3) conducting the sampling survey through face-to-face interviews in selected households; and (4) checking the questionnaire forms and analyzing the data with the Statistical Package for the Social Sciences (SPSS for Windows). All data for the analysis of this methodology and its practical application were taken from the pilot sampling carried out in each stratum. No related paper was found in the literature review.
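Optimum allocation is often taken to mean Neyman allocation, in which each stratum receives a share of the sample proportional to N_h · S_h (stratum size times stratum standard deviation). The sketch below assumes that reading and equal per-unit sampling costs; the stratum sizes and standard deviations are invented and are not the Evros figures.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Split total_n across strata in proportion to N_h * S_h (Neyman allocation)."""
    products = [n_h * s_h for n_h, s_h in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [total_n * p / total for p in products]


# Hypothetical strata: mountainous, semi-mountainous, flat.
sizes = [5000, 3000, 2000]   # households per stratum (assumed)
sds = [2.0, 4.0, 6.0]        # pilot-sample standard deviations (assumed)

allocation = neyman_allocation(300, sizes, sds)
for name, n_h in zip(["mountainous", "semi-mountainous", "flat"], allocation):
    print(f"{name}: {n_h:.1f} households")
```

The function returns fractional sizes; in practice they would be rounded, and a stratum with high variability can receive as large a sample as a bigger but more homogeneous one.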
Abstract: In this paper, we propose a software component under Windows that generates pseudo-random numbers using RDS (Refined Descriptive Sampling) as required by the simulation. RDS is regarded in the literature as the best sampling method. To validate the proposed component, it is applied to the approximation of integrals. The simulation results from RDS using the "RDSRnd" generator were compared with those obtained using the "Rnd" generator included in the Pascal programming language under Windows. The proposed software component gives the best results.
Abstract: In this paper, auxiliary information is used to determine an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted, making use of local polynomial regression estimation to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is investigated against some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. Generally, good confidence intervals are seen for the nonparametric regression estimators, and use of the proposed estimator leads to relatively smaller values of RE compared to other estimators.
Abstract: The aim of this paper is to compare sample quality across two probability samples and one that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters in order to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that a combination of random route sampling and quota sampling (with substitution) can achieve the same degree of representativeness as household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distributions of the probability sampling, where the deviations exceed 6%. A different picture emerges when comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).
Abstract: Srivastava and Jhajj [16] proposed a class of estimators for estimating the population variance using multiple auxiliary variables in simple random sampling, utilizing the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for estimating the population variance in simple random sampling. The expressions for the mean square error of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. Moreover, these theoretical findings are supported by an empirical study of original data.
Abstract: In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Abstract: One of the key assumptions in respondent-driven sampling (RDS) analysis, called the “random selection assumption,” is that respondents randomly recruit their peers from their personal networks. The objective of this study was to verify this assumption in the empirical data of egocentric networks. Methods: We conducted an egocentric network study among young drug users in China, in which RDS was used to recruit this hard-to-reach population. If the random recruitment assumption holds, the RDS-estimated population proportions should be similar to the actual population proportions. Following this logic, we first calculated the population proportions of five visible variables (gender, age, education, marital status, and drug use mode) among the total drug-use alters from which the RDS sample was drawn, and then estimated the RDS-adjusted population proportions and their 95% confidence intervals in the RDS sample. Theoretically, if the random recruitment assumption holds, the 95% confidence intervals estimated in the RDS sample should include the population proportions calculated in the total drug-use alters. Results: The evaluation of the RDS sample indicated its success in reaching the convergence of RDS compositions and including a broad cross-section of the hidden population. Findings demonstrate that the random selection assumption holds for three group traits, but not for two others. Specifically, egos randomly recruited subjects in different age groups, marital status, or drug use modes from their network alters, but not in gender and education levels. Conclusions: This study demonstrates the occurrence of non-random recruitment, indicating that the recruitment of subjects in this RDS study was not completely at random. Future studies are needed to assess the extent to which the population proportion estimates can be biased when the violation of the assumption occurs in some group traits in RDS samples.
Abstract: In this study, we propose a modified ratio-type estimator for the population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and the median of an auxiliary variable x. The estimator's properties are derived up to the first order of Taylor series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies using real populations demonstrate the performance of the developed estimator in comparison with the existing estimators. As these empirical studies illustrate, the proposed estimator performs better than the existing estimators under some specified conditions, i.e., it has the smallest mean squared error and the highest percentage relative efficiency. The developed estimator is therefore suitable for situations in which the variable of interest has a positive correlation with the auxiliary variable.
Abstract: Non-response is a regular occurrence in sample surveys. Developing estimators when non-response exists may result in large biases when estimating population parameters. In this paper, a finite population mean is estimated when non-response exists randomly under two-stage cluster sampling with replacement. It is assumed that non-response arises in the survey variable in the second stage of cluster sampling. A weighting method of compensating for non-response is applied. Asymptotic properties of the proposed estimator of the population mean are derived. Under mild assumptions, the estimator is shown to be asymptotically consistent.
Funding: Supported by the National Key R&D Program of China (2020YFC2003102).
Abstract: Objective: To reveal the distribution characteristics and demographic factors of traditional Chinese medicine (TCM) constitution among elderly individuals in China. Methods: Elderly individuals from seven regions in China were selected as samples in this study using a multistage cluster random sampling method. The basic information questionnaire and the Constitution in Chinese Medicine Questionnaire (Elderly Edition) were used. Descriptive statistical analysis, chi-squared tests, and binary logistic regression analysis were used. Results: The single balanced constitution (BC) accounted for 23.9%. The results of the major TCM constitution types showed that BC (43.2%) accounted for the largest proportion and unbalanced constitutions ranged from 0.9% to 15.7%. East China region (odds ratio [OR] = 2.097; 95% confidence interval [CI], 1.912 to 2.301), married status (OR = 1.341; 95% CI, 1.235 to 1.457), and managers (OR = 1.254; 95% CI, 1.044 to 1.505) were significantly associated with BC. Age > 70 years was associated with qi-deficiency constitution and blood stasis constitution (BSC). Female sex was significantly associated with yang-deficiency constitution (OR = 1.646; 95% CI, 1.52 to 1.782). Southwest region was significantly associated with phlegm-dampness constitution (OR = 1.809; 95% CI, 1.569 to 2.086). North China region was significantly associated with inherited special constitution (OR = 2.521; 95% CI, 1.569 to 4.05). South China region (OR = 2.741; 95% CI, 1.997 to 3.763), Central China region (OR = 8.889; 95% CI, 6.676 to 11.835), senior middle school education (OR = 2.442; 95% CI, 1.932 to 3.088), and managers (OR = 1.804; 95% CI, 1.21 to 2.69) were significantly associated with BSC. Conclusions: This study defined the distribution characteristics and demographic factors of TCM constitution in the elderly population. Adjusting and improving unbalanced constitutions, which are correlated with diseases, can help promote healthy aging through the scientific management of these demographic factors.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 12272355, 12025204, 11902294), the Opening Foundation of Shanxi Provincial Key Laboratory for Advanced Manufacturing Technology of China (No. XJZZ202304), and the Shanxi Provincial Graduate Innovation Project of China (No. 2023KY629).
Abstract: In practical environments, base excitation and crosswind very commonly occur simultaneously. Scavenging the combined energy of vibration and wind with a single energy harvesting structure is attractive. For this purpose, the effects of the wind speed and random excitation level are investigated with the stochastic averaging method (SAM) based on the energy envelope. The results of the analytical prediction are verified with the Monte Carlo method (MCM). The numerical simulation shows that the introduction of wind can reduce the critical excitation level for triggering an inter-well jump and enable a bi-stable energy harvester (BEH) to achieve performance enhancement under weak base excitation. However, as the strength of the wind increases to a particular level, the influence of the random base excitation on the dynamic responses is weakened, and the system exhibits a periodic galloping response. A comparison between a BEH and a linear energy harvester (LEH) indicates that the BEH demonstrates inferior performance for high-speed wind. Relevant experiments are conducted to investigate the validity of the theoretical prediction and numerical simulation. The experimental findings also show that strong random excitation is favorable for the BEH in the range of low wind speeds. However, once the speed of the incoming wind reaches a particular level, the disadvantage of the BEH becomes evident.
Funding: Supported by the Platform Development Foundation of the China Institute for Radiation Protection (No. YP21030101), the National Natural Science Foundation of China (General Program) (Nos. 12175114, U2167209), the National Key R&D Program of China (No. 2021YFF0603600), and the Tsinghua University Initiative Scientific Research Program (No. 20211080081).
Abstract: Global variance reduction is a bottleneck in Monte Carlo shielding calculations. The global variance reduction problem requires that the statistical error of the entire space is uniform. This study proposed a grid-AIS method for the global variance reduction problem based on the AIS method, which was implemented in the Monte Carlo program MCShield. The proposed method was validated using the VENUS-Ⅲ international benchmark problem and a self-shielding calculation example. The results from the VENUS-Ⅲ benchmark problem showed that the grid-AIS method achieved a significant reduction in the variance of the statistical errors of the MESH grids, decreasing from 1.08×10^(-2) to 3.84×10^(-3), representing a 64.00% reduction. This demonstrates that the grid-AIS method is effective in addressing global issues. The results of the self-shielding calculation demonstrate that the grid-AIS method produced accurate computational results. Moreover, the grid-AIS method exhibited a computational efficiency approximately one order of magnitude higher than that of the AIS method and approximately two orders of magnitude higher than that of the conventional Monte Carlo method.
Funding: National Natural Science Foundation of China (No. 60572113, No. 10501026) and Liuhui Center for Applied Mathematics
Abstract: Signals are often of random character, since they cannot carry any information if they are predictable for every time t; they are usually modelled as stationary random processes. On the other hand, because of the inertia of the measurement apparatus, the sampled values obtained in practice may not be the precise values of the signal X(t) at times t_k (k ∈ Z), but only local averages of X(t) near t_k. In this paper, it is shown that a wide-sense (or weak-sense) stationary stochastic process can be approximated by generalized sampling series with local average samples.
Abstract: BACKGROUND The mucosal barrier's immune-brain interactions, pivotal for neural development and function, are increasingly recognized for their potential causal and therapeutic relevance to irritable bowel syndrome (IBS). Prior studies linking immune inflammation with IBS have been inconsistent. To further elucidate this relationship, we conducted a Mendelian randomization (MR) analysis of 731 immune cell markers to dissect the influence of various immune phenotypes on IBS. Our goal was to deepen our understanding of the disrupted brain-gut axis in IBS and to identify novel therapeutic targets. AIM To leverage publicly available data to perform MR analysis on 731 immune cell markers and explore their impact on IBS. We aimed to uncover immunophenotypic associations with IBS that could inform future drug development and therapeutic strategies. METHODS We performed a comprehensive two-sample MR analysis to evaluate the causal relationship between immune cell markers and IBS. By utilizing genetic data from public databases, we examined the causal associations between 731 immune cell markers, encompassing median fluorescence intensity, relative cell abundance, absolute cell count, and morphological parameters, with IBS susceptibility. Sensitivity analyses were conducted to validate our findings and address potential heterogeneity and pleiotropy. RESULTS Bidirectional false discovery rate correction indicated no significant influence of IBS on immunophenotypes. However, our analysis revealed a causal impact of IBS on 30 out of 731 immune phenotypes (P < 0.05). Nine immune phenotypes demonstrated a protective effect against IBS [inverse variance weighting (IVW) < 0.05, odds ratio (OR) < 1], while 21 others were associated with an increased risk of IBS onset (IVW ≥ 0.05, OR ≥ 1). CONCLUSION Our findings underscore a substantial genetic correlation between immune cell phenotypes and IBS, providing valuable insights into the pathophysiology of the condition. These results pave the way for the development of more precise biomarkers and targeted therapies for IBS. Furthermore, this research enriches our comprehension of immune cell roles in IBS pathogenesis, offering a foundation for more effective, personalized treatment approaches. These advancements hold promise for improving IBS patient quality of life and reducing the disease burden on individuals and their families.
Funding: The National Natural Science Foundation of China (Grant 42177164) and the Distinguished Youth Science Foundation of Hunan Province of China (2022JJ10073).
Abstract: As massive underground projects have become popular in dense urban cities, a problem has arisen: which model best predicts Tunnel Boring Machine (TBM) performance in these tunneling projects? The performance level of TBMs in complex geological conditions is still a great challenge for practitioners and researchers. At the same time, a reliable and accurate prediction of TBM performance is essential to planning an applicable tunnel construction schedule. The performance of a TBM is very difficult to estimate due to various geotechnical and geological factors and machine specifications. The intelligent techniques previously proposed in this field are mostly based on a single or base model with a low level of accuracy. Hence, this study aims to introduce a hybrid random forest (RF) technique optimized by global harmony search with generalized opposition-based learning (GOGHS) for forecasting TBM advance rate (AR). Optimizing the RF hyper-parameters, e.g., tree number and maximum tree depth, is the main objective of using the GOGHS-RF model. In the modelling of this study, a comprehensive database with the most influential parameters on TBM together with TBM AR were used as input and output variables, respectively. To examine the capability and power of the GOGHS-RF model, three more hybrid models, particle swarm optimization-RF, genetic algorithm-RF and artificial bee colony-RF, were also constructed to forecast TBM AR. The developed models were evaluated by calculating several performance indices, including the determination coefficient (R²), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results showed that GOGHS-RF is a more accurate technique for estimating TBM AR than the other applied models. The newly developed GOGHS-RF model achieved R² = 0.9937 and 0.9844 for the train and test stages, respectively, which are higher than those of a pre-developed RF. Also, the importance of the input parameters was interpreted through the SHapley Additive exPlanations (SHAP) method, and it was found that thrust force per cutter is the most important variable for TBM AR. The GOGHS-RF model can be used in mechanized tunnel projects for predicting and checking performance.