Suppose that there are two populations x and y with missing data on both of them, where x has an unknown distribution function F(·) and y has a distribution function Gθ(·) with a probability density function gθ(·) of known form depending on an unknown parameter θ. Fractional imputation is used to fill in the missing data. The asymptotic distributions of the semi-empirical likelihood ratio statistic are obtained under some mild conditions. Then, empirical likelihood confidence intervals on the differences of x and y are constructed.
Group testing is a method of pooling a number of units together and performing a single test on the resulting group. It is an appealing option when few individual units are thought to be infected, because it reduces testing costs compared with testing each unit individually. Group testing aims either to identify the positive groups among all the groups tested or to estimate the proportion of positives (p) in a population. Interval estimation methods for the proportion in group testing with unequal group sizes, adjusted for overdispersion, have been examined. Recent improvements in statistical methods allow the construction of highly accurate confidence intervals (CIs). The aim here is to apply group testing for estimation and to generate highly accurate bootstrap CIs for the proportion of defective or positive units. This study compares several established methods of constructing CIs for a binomial proportion after adjusting for overdispersion in group testing with groups of unequal sizes. Bootstrap resampling was applied to data simulated from the binomial distribution, and confidence intervals with high coverage probabilities were produced. The data were assumed to be overdispersed, independent between groups, and correlated within groups. Interval estimation methods based on the Wald, logit, and complementary log-log (CLL) functions were considered. The main criterion used in the comparisons is the coverage probability attained by nominal 95% CIs, though interval width is also considered. Bootstrapping produced CIs with high coverage probabilities for each of the three interval methods.
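The bootstrap idea in this abstract can be illustrated with a deliberately simplified sketch. It assumes equal group sizes, a perfect test, and no overdispersion adjustment, so it is not the study's procedure; the function names, the example data, and the percentile-type interval are all assumptions made for illustration.

```python
import random


def estimate_p(group_results, group_size):
    """MLE of the individual-level proportion from equal-sized group tests.

    group_results holds 1 for a positive group and 0 for a negative group.
    Assumes a perfect test and independent units (a simplification).
    """
    positive_fraction = sum(group_results) / len(group_results)
    return 1.0 - (1.0 - positive_fraction) ** (1.0 / group_size)


def bootstrap_percentile_ci(group_results, group_size,
                            n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval obtained by resampling whole groups."""
    rng = random.Random(seed)
    n = len(group_results)
    estimates = sorted(
        estimate_p([rng.choice(group_results) for _ in range(n)], group_size)
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = estimates[int((alpha / 2) * (n_boot - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Hypothetical data: 50 groups of 5 units each, 12 groups tested positive.
group_results = [1] * 12 + [0] * 38
group_size = 5
p_hat = estimate_p(group_results, group_size)
ci_low, ci_high = bootstrap_percentile_ci(group_results, group_size)
print(p_hat, ci_low, ci_high)
```

Resampling whole groups rather than individual units keeps any within-group correlation inside each resampled unit, which is one reason group-level resampling is a natural choice here.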
This paper presents four methods of constructing the confidence interval for the proportion p of the binomial distribution. Evidence in the literature indicates the standard Wald confidence interval for the binomial proportion is inaccurate, especially for extreme values of p. Even for moderately large sample sizes, the coverage probabilities of the Wald confidence interval prove to be erratic for extreme values of p. Three alternative confidence intervals, namely, the Wilson confidence interval, the Clopper-Pearson interval, and the likelihood interval, are compared to the Wald confidence interval on the basis of coverage probability and expected length by means of simulation.
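As a small illustration of why the Wald interval misbehaves for extreme p, the following sketch computes the Wald and Wilson intervals from their standard textbook formulas. The function names and the example counts are assumptions; the Clopper-Pearson and likelihood intervals from the abstract are not covered.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Standard Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z * z / (4 * n * n)
    ) / denom
    return center - half_width, center + half_width


# Hypothetical example with an extreme proportion: 1 success in 20 trials.
print(wald_interval(1, 20))    # lower limit falls below 0
print(wilson_interval(1, 20))  # stays inside [0, 1]
```

With zero successes the Wald interval collapses to a single point at 0, while the Wilson interval still has positive width.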
The receiver operating characteristic (ROC) curve has been widely used in scientific research fields. After using the random hot deck imputation, we propose the smoothed empirical likelihood ratio statistic for the ROC curve with missing data. Its asymptotic distribution is a scaled chi-square distribution and empirical likelihood confidence intervals for ROC curves are constructed. The simulation study shows that the proposed interval estimates perform well based on the coverage probability for different sample sizes and response rates.
This article deals with correlating two variables that have values that fall below the known limit of detection (LOD) of the measuring device; these values are known as non-detects (NDs). We use simulation to compare several methods for estimating the association between two such variables. The most commonly used method, simple substitution, consists of replacing each ND with some representative value such as LOD/2. Spearman’s correlation, in which all NDs are assumed to be tied at some value just smaller than the LOD, is also used. We evaluate each method under several scenarios, including small to moderate sample size, moderate to large censoring proportions, extreme imbalance in censoring proportions, and non-bivariate normal (BVN) data. In this article, we focus on the coverage probability of 95% confidence intervals obtained using each method. Confidence intervals using a maximum likelihood approach based on the assumption of BVN data have acceptable performance under most scenarios, even with non-BVN data. Intervals based on Spearman’s coefficient also perform well under many conditions. The methods are illustrated using real data taken from the biomarker literature.
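The simple substitution method described above can be sketched as follows. Encoding a non-detect as `None`, the function names, and the example measurements are all assumptions for illustration; the maximum likelihood and Spearman approaches from the abstract are not shown.

```python
import math


def substitute_nondetects(values, lod):
    """Replace each non-detect (encoded here as None) with LOD / 2."""
    return [lod / 2 if v is None else v for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements; None marks a value below the LOD.
x_raw = [None, 1.4, 2.2, None, 3.1, 1.8]
y_raw = [0.9, None, 2.5, None, 3.4, 2.0]
x = substitute_nondetects(x_raw, lod=1.0)
y = substitute_nondetects(y_raw, lod=0.8)
r = pearson(x, y)
print(r)
```

Because every non-detect is mapped to the same value, heavy censoring creates many artificial ties, which is one reason the abstract compares substitution against likelihood-based and rank-based alternatives.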
We discuss formulas and techniques for finding maximum-likelihood estimators of parameters of autoregressive (with particular emphasis on Markov and Yule) models, computing their asymptotic variance-covariance matrix and displaying the resulting confidence regions; Monte Carlo simulation is then used to establish the accuracy of the corresponding level of confidence. The results indicate that a direct application of the Central Limit Theorem yields errors too large to be acceptable; instead, we recommend using a technique based directly on the natural logarithm of the likelihood function, verifying its substantially higher accuracy. Our study is then extended to the case of estimating only a subset of a model’s parameters, when the remaining ones (called nuisance) are of no interest to us.
Detecting population (group) differences is useful in many applications, such as medical research. In this paper, we explore the probabilistic theory for identifying the quantile differences between two populations. Suppose that there are two populations x and y with missing data on both of them, where x is nonparametric and y is parametric. We are interested in constructing confidence intervals on the quantile differences of x and y. Random hot deck imputation is used to fill in missing data. Semi-empirical likelihood confidence intervals on the differences are constructed.
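Random hot deck imputation, as used in this abstract, fills each missing value with a value drawn at random from the observed responses in the same sample. A minimal sketch follows, assuming missing values are encoded as `None`; the function name and the sample are hypothetical, and the semi-empirical likelihood step is not shown.

```python
import random


def random_hot_deck(values, seed=0):
    """Fill each missing entry with a random draw (with replacement)
    from the observed entries of the same sample."""
    rng = random.Random(seed)
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("no observed values to draw from")
    return [v if v is not None else rng.choice(donors) for v in values]


# Hypothetical sample with two missing responses.
sample = [2.1, None, 3.4, 1.9, None, 2.8]
completed = random_hot_deck(sample)
print(completed)
```

Observed values are left untouched; only the missing positions receive donor values, so the completed sample keeps the empirical distribution of the respondents.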
Hydrological risk is highly dependent on the occurrence of extreme rainfalls. This fact has led to a wide range of studies on the estimation and uncertainty analysis of the extremes. In most cases, confidence intervals (CIs) are constructed to represent the uncertainty of the estimates. Since the accuracy of CIs depends on the asymptotic normality of the data and is questionable with limited observations in practice, a Bayesian highest posterior density (HPD) interval, bootstrap percentile interval, and profile likelihood (PL) interval have been introduced to analyze the uncertainty that does not depend on the normality assumption. However, comparison studies to investigate their performances in terms of the accuracy and uncertainty of the estimates are scarce. In addition, the strengths, weaknesses, and conditions necessary for performing each method also must be investigated. Accordingly, in this study, test experiments with simulations from varying parent distributions and different sample sizes were conducted. Then, applications to the annual maximum rainfall (AMR) time series data in South Korea were performed. Five districts with 38-year (1973–2010) AMR observations were fitted by the three aforementioned methods in the application. From both the experimental and application results, the Bayesian method is found to provide the lowest uncertainty of the design level while the PL estimates generally have the highest accuracy but also the largest uncertainty. The bootstrap estimates are usually inferior to the other two methods, but can perform adequately when the distribution model is not heavy-tailed and the sample size is large. The distribution tail behavior and the sample size are clearly found to affect the estimation accuracy and uncertainty. This study presents a comparative result, which can help researchers make decisions in the context of assessing extreme rainfall uncertainties.
Suppose that there are two nonparametric populations x and y with missing data on both of them. We are interested in constructing confidence intervals on the quantile differences of x and y. Random imputation is used. Empirical likelihood confidence intervals on the differences are constructed.
Point-wise confidence intervals for a nonparametric regression function with random design points are considered. The confidence intervals are those based on the traditional normal approximation and the empirical likelihood. Their coverage accuracy is assessed by developing the Edgeworth expansions for the coverage probabilities. It is shown that the empirical likelihood confidence intervals are Bartlett correctable.
Empirical likelihood is discussed by using the blockwise technique for strongly stationary, positively associated random variables. Our results show that the statistic is asymptotically chi-square distributed and that the corresponding confidence interval can be constructed.
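For orientation, the basic empirical likelihood ratio statistic for a mean can be sketched in the simplest setting of independent observations. This is not the blockwise procedure of the abstract: it ignores dependence and blocking, and the function name, bisection solver, and example data are assumptions for illustration.

```python
import math


def el_statistic_mean(x, mu, iters=200):
    """-2 log empirical likelihood ratio for the mean of i.i.d. data.

    Solves sum(z_i / (1 + lam * z_i)) = 0 for lam by bisection, where
    z_i = x_i - mu, then returns 2 * sum(log(1 + lam * z_i)).
    """
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    if not (z_min < 0 < z_max):
        return math.inf  # mu lies outside the convex hull of the data
    # lam must keep every weight positive: 1 + lam * z_i > 0 for all i.
    lo, hi = -1.0 / z_max, -1.0 / z_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # score is strictly decreasing in lam on (lo, hi), so bisection works.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


# Hypothetical data; 3.841 is the 0.95 quantile of chi-square with 1 df.
data = [1.2, 0.7, 2.3, 1.9, 1.1, 0.4, 1.6]
stat = el_statistic_mean(data, mu=1.0)
print(stat, stat <= 3.841)
```

A confidence interval is the set of candidate means whose statistic stays below the chi-square critical value; in the blockwise version the same machinery would be applied to block averages rather than to individual observations.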
Profile likelihood function is introduced to analyze the uncertainty of hydrometeorological extreme inference and the theory of estimating confidence intervals of the key parameters and quantiles of extreme value distribution by profile likelihood function is described. GEV (generalized extreme value) distribution and GP (generalized Pareto) distribution are used respectively to fit the annual maximum daily flood discharge sample of the Yichang station in the Yangtze River and the daily rainfall sample in 10 big cities including Guangzhou. The parameters of the models are estimated by maximum likelihood method and the fitting results are tested by probability plot, quantile plot, return level plot and density plot. The return levels and confidence intervals of flood and rainstorm in different return periods are calculated by profile likelihood function. The results show that the asymmetry of the profile likelihood function curve increases with the return period, which can reflect the effect of the length of sample series and return periods on confidence interval. As an effective tool for estimating confidence interval of the key parameters and quantiles of extreme value distribution, profile likelihood function can lead to a more accurate result and help to analyze the uncertainty of extreme values of hydrometeorology.
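In generic notation, the profile likelihood interval referred to above can be written as follows. The symbols (parameter of interest ψ, nuisance parameters λ, log-likelihood ℓ, return period T, and GEV location μ, scale σ, shape ξ) are notation assumed here and do not come from the abstract; the return-level expression is the usual GEV quantile for ξ ≠ 0.

```latex
% Profile log-likelihood: maximize over the nuisance parameters
\ell_p(\psi) = \max_{\lambda}\, \ell(\psi, \lambda)

% Approximate (1 - \alpha) confidence interval for \psi
\mathrm{CI}_{1-\alpha} = \Bigl\{ \psi :
  2\bigl[\ell(\hat\psi, \hat\lambda) - \ell_p(\psi)\bigr]
  \le \chi^2_{1,\,1-\alpha} \Bigr\}

% GEV return level for return period T (\xi \neq 0)
z_T = \mu - \frac{\sigma}{\xi}
  \Bigl[ 1 - \bigl\{ -\log\bigl(1 - \tfrac{1}{T}\bigr) \bigr\}^{-\xi} \Bigr]
```

To profile a return level, the model is reparametrized so that z_T replaces μ as a parameter. Because the interval follows the shape of the likelihood rather than forcing symmetry about the estimate, it can be asymmetric, consistent with the asymmetry the abstract reports for long return periods.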
In this paper, two kinds of Kullback-Leibler criteria with appropriate constraints are proposed to construct empirical likelihood confidence intervals for the mean of right censored data. It is shown that one of the criteria is equivalent to Adimari’s (1997) procedure, and the other shares the same asymptotic behavior.
Funding (semi-empirical likelihood study with fractional imputation): NSF of China (10661003); SRF for ROCS, SEM ([2004]527); NSF of Guangxi (0728092); Innovation Project of Guangxi Graduate Education ([2006]40).
Funding (semi-empirical likelihood study of quantile differences with random hot deck imputation): National Natural Science Foundation of China (10661003); Natural Science Foundation of Guangxi (0728092).
Funding (extreme rainfall uncertainty study): Hanyang University (Grant No. HY-2014).
Funding (empirical likelihood study of quantile differences for two nonparametric populations): National Natural Science Foundation of China (No. 10661003); Natural Science Foundation of Guangxi (No. 0728092).
Funding (blockwise empirical likelihood study): National Natural Science Foundation of China (No. 10661003).
Funding (profile likelihood study of hydrometeorological extremes): National Basic Research Program of China ("973" Program) (Grant Nos. 2013CB036406, 2010CB951102); National Natural Science Foundation of China (Grant No. 51109224).
Funding (Kullback-Leibler criteria study for right censored data): partially supported by the NSF of China (No. 10071090) and the Chinese Academy of Sciences.