Funding: Project supported by the National Natural Science Foundation of China (No. 40171078), a fund from Hong Kong Polytechnic University (No. 1.34.9709), and the Research Grants Council of Hong Kong SAR (No. 3ZB40).
Abstract: On the basis of the principles of simple random sampling, a statistical model of the rate of disfigurement (RD) is put forward and described in detail. Following the definition of simple random sampling for attribute data in GIS, the mean and variance of the RD are derived as the characteristic values of the statistical model, in order to show that the RD is a feasible measure of the accuracy of attribute data in GIS. On the basis of this mean and variance, a quality assessment method for the attribute data of vector maps during data collection is discussed. An RD spread graph is also drawn to check whether the quality of the attribute data is under control. The RD model gives an overall judgment of attribute data quality, unlike other measurement coefficients, which address only classification accuracy.
Funding: Supported by the National Natural Science Foundation of China (11901236), the Scientific Research Fund of Hunan Provincial Science and Technology Department (2019JJ50479), the Scientific Research Fund of Hunan Provincial Education Department (18B322), the Winning Bid Project of Hunan Province for the 4th National Economic Census ([2020]1), the Young Core Teacher Foundation of Hunan Province ([2020]43), and the Fundamental Research Fund of Xiangxi Autonomous Prefecture (2018SF5026).
Abstract: Cost-effective sampling design is a major concern in some experiments, especially when measuring the characteristic of interest is costly, painful, or time consuming. Ranked set sampling (RSS) was first proposed by McIntyre [1952. A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research 3, 385-390] as an effective way to estimate the pasture mean. In the current paper, a modification of ranked set sampling called moving extremes ranked set sampling (MERSS) is considered for the best linear unbiased estimators (BLUEs) for the simple linear regression model. The BLUEs for this model under MERSS are derived. For normal data, the BLUEs under MERSS are shown to be markedly more efficient than the BLUEs under simple random sampling.
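As background for the abstract above, the following is a minimal sketch of basic balanced ranked set sampling under perfect ranking, compared with simple random sampling for estimating a mean. It illustrates McIntyre's original scheme only, not MERSS or the BLUEs derived in the paper. The function names, the standard normal population, and the settings (set size 3, 4 cycles, 2000 replications) are assumptions chosen for illustration.

```python
import random
import statistics


def srs_mean(draw, n):
    """Mean of n independently drawn and measured units (simple random sampling)."""
    return statistics.fmean([draw() for _ in range(n)])


def rss_mean(draw, set_size, cycles):
    """Mean of a balanced ranked set sample, assuming ranking is perfect.

    For each rank, a fresh set of `set_size` units is drawn and ranked,
    and only the unit holding that rank is measured.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            ranked_set = sorted(draw() for _ in range(set_size))
            measured.append(ranked_set[rank])
    return statistics.fmean(measured)


def compare_variances(reps=2000, set_size=3, cycles=4, seed=1):
    """Monte Carlo variances of the SRS and RSS means with equal measured units."""
    rng = random.Random(seed)
    draw = lambda: rng.gauss(0.0, 1.0)  # hypothetical standard normal population
    n = set_size * cycles  # both designs measure the same number of units
    srs = [srs_mean(draw, n) for _ in range(reps)]
    rss = [rss_mean(draw, set_size, cycles) for _ in range(reps)]
    return statistics.variance(srs), statistics.variance(rss)


if __name__ == "__main__":
    var_srs, var_rss = compare_variances()
    print(f"variance of SRS mean: {var_srs:.4f}")
    print(f"variance of RSS mean: {var_rss:.4f}")
```

Both designs measure the same number of units; RSS draws more units than it measures, which is why it suits settings where ranking is cheap and measurement is expensive.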
Abstract: Srivastava and Jhajj [16] proposed a class of estimators for the population variance that uses multiple auxiliary variables in simple random sampling and utilizes the means and variances of the auxiliary variables. In this paper, we adapt this class and, motivated by Searle [13], suggest a more generalized class of estimators for the population variance in simple random sampling. Expressions for the mean square error (MSE) of the proposed class are derived in general form. Besides obtaining the minimized MSE of the proposed and adapted classes, it is shown that the adapted class is a special case of the proposed class. These theoretical findings are supported by an empirical study of original data.
Abstract: In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a super population model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimators is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Research Groups Program under Grant No. R.G.P.2/68/41. I.M.A. and A.I.A. received the grant.
Abstract: This article proposes two new Ranked Set Sampling (RSS) designs for estimating population parameters: Simple Z Ranked Set Sampling (SZRSS) and Generalized Z Ranked Set Sampling (GZRSS). These designs provide unbiased estimators for the mean of symmetric distributions. It is shown that for non-uniform symmetric distributions, the estimators of the mean under the suggested designs are more efficient than those obtained by RSS, Simple Random Sampling (SRS), extreme RSS, and truncation-based RSS designs. The proposed RSS schemes also outperform other RSS schemes and provide more efficient estimates than their competitors under imperfect rankings. The suggested mean estimators under perfect and imperfect rankings are more efficient than the linear regression estimator under SRS. The proposed RSS designs are also extended to cover estimation of the population median. Real data are used to examine the usefulness and efficiency of the estimators.
Funding: The Ministry of Agriculture and Forestry key project "Puuta liikkeelle ja uusia tuotteita metsästä" ("Wood on the move and new products from forest") and the Academy of Finland (project numbers 295100, 306875).
Abstract: Background: The local pivotal method (LPM) utilizing auxiliary data in sample selection has recently been proposed as a sampling method for national forest inventories (NFIs). Its performance compared to simple random sampling (SRS) and LPM with geographical coordinates has produced promising results in simulation studies. In this simulation study we compared all these sampling methods to systematic sampling. The LPM samples were selected solely using the coordinates (LPMxy) or, in addition to that, auxiliary remote sensing-based forest variables (RS variables). We utilized field measurement data (NFI-field) and Multi-Source NFI (MS-NFI) maps as target data, and independent MS-NFI maps as auxiliary data. The designs were compared using relative efficiency (RE), a ratio of the mean squared errors of the reference sampling design against the studied design. Applying a method in NFI also requires a proven estimator for the variance. Therefore, three different variance estimators were evaluated against the empirical variance of replications: 1) an estimator corresponding to SRS; 2) a Grafström-Schelin estimator repurposed for LPM; and 3) a Matérn estimator applied in the Finnish NFI for systematic sampling design. Results: The LPMxy was nearly comparable with the systematic design for most target variables. The REs of the LPM designs utilizing auxiliary data compared to the systematic design varied between 0.74 and 1.18, depending on the studied target variable. The SRS estimator for variance was, as expected, the most biased and conservative estimator. Similarly, the Grafström-Schelin estimator gave overestimates in the case of LPMxy. When the RS variables were utilized as auxiliary data, the Grafström-Schelin estimates tended to underestimate the empirical variance. In systematic sampling the Matérn and Grafström-Schelin estimators performed equally for practical purposes. Conclusions: LPM optimized for a specific variable tended to be more efficient than systematic sampling, but all of the considered LPM designs were less efficient than the systematic sampling design for some target variables. The Grafström-Schelin estimator could be used as such with LPMxy or instead of the Matérn estimator in systematic sampling. Further studies of the variance estimators are needed if other auxiliary variables are to be used in LPM.
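The relative efficiency used in the abstract above can be written out as a small sketch. It reads the abstract's wording as the MSE of the reference design divided by the MSE of the studied design, so values above 1 favor the studied design. The function names and the toy replicate estimates are hypothetical and are not data from the study.

```python
def mse(estimates, true_value):
    """Empirical mean squared error of replicate estimates around the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def relative_efficiency(reference_estimates, studied_estimates, true_value):
    """RE = MSE(reference design) / MSE(studied design); above 1 favors the studied design."""
    return mse(reference_estimates, true_value) / mse(studied_estimates, true_value)


if __name__ == "__main__":
    # Hypothetical replicate estimates of a target whose true value is 10.
    reference = [9.0, 11.0]
    studied = [9.5, 10.5]
    print(relative_efficiency(reference, studied, 10.0))
```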
Funding: Supported by the National Natural Science Foundation of China.
Abstract: This paper proposes a new method for increasing precision in survey sampling, namely a method that combines sampling with prediction. Two cases are considered: auxiliary information is available, and auxiliary information is not available. A numerical example is given.
Abstract: This paper reveals some problems in the application of forest sampling surveys and points out their defects. A method for determining the sample size is derived precisely from the origin of the formula in the simple random sampling procedure. In stratified random sampling, two cases are distinguished: the variances $S_h^2$ are equal for all $h$, and not all $S_h^2$ are equal. This method makes the resulting confidence interval statements more reliable.
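For orientation, the textbook sample size calculation for estimating a mean under simple random sampling can be sketched as follows. This is the standard normal-approximation formula with an optional finite population correction, not necessarily the refined method the paper proposes. The function name, parameter names, and example values are assumptions for illustration.

```python
import math
from statistics import NormalDist


def srs_sample_size(std_dev, margin, population_size=None, confidence=0.95):
    """Smallest n such that the half-width of the confidence interval is at most `margin`.

    Uses n0 = (z * S / d)^2 and, when the population size N is given,
    the finite population correction n = n0 / (1 + n0 / N).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal quantile
    n0 = (z * std_dev / margin) ** 2
    if population_size is not None:
        n0 = n0 / (1 + n0 / population_size)
    return math.ceil(n0)


if __name__ == "__main__":
    # Hypothetical inputs: S = 10, allowed error d = 2, 95% confidence.
    print(srs_sample_size(10, 2))
    print(srs_sample_size(10, 2, population_size=1000))
```

The standard deviation S has to come from a pilot sample or prior survey, so the computed n is only as reliable as that prior estimate.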
Abstract: The curve relating fatigue crack growth rate to the stress strength factor amplitude represents an important fatigue property in designing damage tolerance limits and predicting the life of metallic component parts. To make more reasonable use of testing data, samples from the population were stratified as suggested by the stratified random sample model (SRAM). The data in each stratum corresponded to the same experimental conditions. A suitable weight was assigned to each stratified sample according to the actual working states of the pressure vessel, so that the estimated fatigue crack growth rate equation was more accurate in practice. An empirical study shows that the SRAM estimate using fatigue crack growth rate data from different stoves is clearly better than the estimate from the simple random sample model.
Abstract: This manuscript deals with a new class of almost unbiased ratio-cum-product estimators for the population mean of the study variable, using known values of the auxiliary variable. The bias and mean squared error of the proposed estimators are obtained. An empirical study on some known natural populations is carried out to assess the efficiency of the proposed estimators relative to existing estimators. It shows that the proposed estimators are almost unbiased and that they perform better than the existing estimators.
Abstract: This research aims to develop a model to enhance the diagnosis of lymphatic diseases using the random forest ensemble machine-learning method trained with a simple sampling scheme. The study was carried out in two major phases: feature selection and classification. In the first stage, a number of discriminative features out of 18 were selected using PSO and several feature selection techniques to reduce the feature dimension. In the second stage, we applied the random forest ensemble classification scheme to diagnose lymphatic diseases. In experiments with the selected features, we used the original and resampled distributions of the dataset to train the random forest classifier. Experimental results demonstrate that the proposed method achieves a remarkable improvement in classification accuracy.
Abstract: This paper presents some novel entropy estimators of a continuous random variable using simple random sampling (SRS), ranked set sampling (RSS), and double RSS (DRSS) schemes. The theoretical results for the proposed entropy estimators are derived. The proposed entropy estimators are compared in terms of bias and root mean squared error, theoretically and numerically, with the entropy estimators of Vasicek [A test for normality based on sample entropy, J. R. Stat. Soc. B 38: 54–59, 1976] using SRS, RSS, and DRSS schemes. It turns out that the new entropy estimators are substantially better than Vasicek's existing entropy estimators.
Abstract: In the present paper, we propose an efficient scrambled estimator of the population mean of a quantitative sensitive study variable, using a general linear transformation of a nonsensitive auxiliary variable. Efficiency comparisons with the existing estimators have been carried out both theoretically and numerically. It has been found that our optimal scrambled estimator is always more efficient than most of the existing scrambled estimators, and that it is more efficient than a few other scrambled estimators under some conditions.
Funding: The National Natural Science Foundation of China (No. 10071091).
Abstract: This paper sheds light on an open problem put forward by Cochran [1]. Two commonly used variance estimators, $v_1(\hat{R})$ and $v_2(\hat{R})$, of the ratio estimator $\hat{R}$ for the population ratio $R$, based on a small sample selected by simple random sampling, are compared following the idea of the estimated loss approach (see [2]). Under the superpopulation model for which the ratio estimator $\hat{\bar{Y}}_R$ of the population mean $\bar{Y}$ is the best linear unbiased one, the necessary and sufficient conditions for $v_1(\hat{R})$ to be preferred over $v_2(\hat{R})$, and for $v_2(\hat{R})$ to be preferred over $v_1(\hat{R})$, are obtained when the sampling fraction $f$ is ignored. For a substantial $f$, several rigorous sufficient conditions for $v_2(\hat{R})$ to be preferred over $v_1(\hat{R})$ are derived.
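To make the two variance estimators concrete, here is a minimal sketch of the pair commonly given in sampling textbooks for the ratio estimator under simple random sampling. Both share the residual variance term and differ only in whether the known population mean of x or the sample mean of x appears in the denominator. Which of the two the abstract labels as v1 and which as v2 is an assumption here, as are the function name and the toy data.

```python
def ratio_variance_estimators(x, y, pop_mean_x, pop_size):
    """Return (r_hat, v1, v2) for the ratio estimator r_hat = sum(y) / sum(x).

    Assumed labeling (not confirmed by the abstract):
      v1 = (1 - f) / (n * Xbar^2) * s_d^2   uses the population mean of x
      v2 = (1 - f) / (n * xbar^2) * s_d^2   uses the sample mean of x
    where s_d^2 = sum((y_i - r_hat * x_i)^2) / (n - 1) and f = n / N.
    """
    n = len(x)
    f = n / pop_size  # sampling fraction
    r_hat = sum(y) / sum(x)
    s_d2 = sum((yi - r_hat * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    x_bar = sum(x) / n
    base = (1 - f) * s_d2 / n
    return r_hat, base / pop_mean_x ** 2, base / x_bar ** 2


if __name__ == "__main__":
    # Hypothetical small sample from a population of N = 30 with known Xbar = 2.5.
    r_hat, v1, v2 = ratio_variance_estimators([1, 2, 3], [2, 4, 7], 2.5, 30)
    print(r_hat, v1, v2)
```

Because the two estimators differ by the factor (Xbar / xbar)^2, their ordering in any given sample depends only on whether the sample mean of x falls above or below the population mean.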