To address the shortage of down-regulation margin in integrated energy systems caused by the intermittent and volatile output of wind and solar generation, this study formulates a coordinated strategy between the system's carbon capture unit and the resources on the load-storage side. A scheduling model is devised that accounts for the confidence interval of renewable energy output, with the overall goal of low-carbon system operation. First, the temporal energy-shifting characteristics and low-carbon regulation mechanisms of the source-side carbon capture power plant under integrated, flexible operation are analyzed in depth. Building on this analysis, a model of the adjustable resources on the load-storage side is devised based on the electro-thermal coupling of the energy system. Next, differences in the confidence intervals of renewable energy output are considered, and a flexible upper bound for the confidence interval is proposed. On this basis, a low-carbon dispatch model for the integrated energy system is established that accounts for the margin provided by the adjustable resources. Finally, a simulation of a regional electric-heating integrated energy system assesses the impact of source-load-storage coordination on low-carbon operation under various reduction-margin reserve scenarios. The results show that the proposed scheduling model, which incorporates confidence intervals for the reduction-margin reserves, effectively mitigates the uncertainty of renewable energy output. Through coordinated scheduling of source, load, and storage, it widens the range over which renewable energy can be accommodated, preserves the economic efficiency of system operation under low-carbon conditions, and confirms the soundness and effectiveness of the proposed approach.
Purpose: We aim to extend our investigations related to the Relative Intensity of Collaboration (RIC) indicator by constructing a confidence interval for the obtained values. Design/methodology/approach: We use Mantel-Haenszel statistics as applied recently by Smolinsky, Klingenberg, and Marx. Findings: We obtain confidence intervals for the RIC indicator. Research limitations: It is not obvious that data obtained from the Web of Science (or any other database) can be considered a random sample. Practical implications: We explain how to calculate confidence intervals. Bibliometric indicators are more often than not presented as precise values instead of an approximation depending on the database and the time of measurement. Our approach presents a suggestion to solve this problem. Originality/value: Our approach combines the statistics of binary categorical data and bibliometric studies of collaboration.
Suppose that there are two populations x and y with missing data on both of them, where x has an unknown distribution function F(·) and y has a distribution function Gθ(·) with a probability density function gθ(·) of known form depending on an unknown parameter θ. Fractional imputation is used to fill in the missing data. The asymptotic distributions of the semi-empirical likelihood ratio statistic are obtained under mild conditions. Then, empirical likelihood confidence intervals on the differences between x and y are constructed.
This paper provides methods for assessing the precision of cost elasticity estimates when the underlying regression function is assumed to be polynomial. Specifically, the paper adapts two well-known methods for computing confidence intervals for ratios: the delta method and the Fieller method. We show that performing the estimation with mean-centered explanatory variables provides a straightforward way to estimate the elasticity and compute a confidence interval for it. A theoretical discussion of the proposed methods is provided, as well as an empirical example based on publicly available postal data. Possible areas of application include postal service providers worldwide, transportation, and electricity.
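As a generic illustration not taken from the paper itself, the delta-method interval for a ratio of two regression coefficients, say θ = β1/β2, is usually built from the first-order variance approximation below; the Fieller method instead inverts a test on β1 − θβ2 and generally produces an asymmetric interval.

```latex
% Delta-method variance of a ratio \hat\theta = \hat\beta_1/\hat\beta_2 and the resulting interval
\operatorname{Var}(\hat\theta) \approx
  \frac{\operatorname{Var}(\hat\beta_1)}{\hat\beta_2^{2}}
  + \frac{\hat\beta_1^{2}\,\operatorname{Var}(\hat\beta_2)}{\hat\beta_2^{4}}
  - \frac{2\hat\beta_1\,\operatorname{Cov}(\hat\beta_1,\hat\beta_2)}{\hat\beta_2^{3}},
\qquad
\hat\theta \pm z_{1-\alpha/2}\sqrt{\operatorname{Var}(\hat\theta)}.
```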
This paper presents four methods of constructing the confidence interval for the proportion p of the binomial distribution. Evidence in the literature indicates that the standard Wald confidence interval for the binomial proportion is inaccurate, especially for extreme values of p. Even for moderately large sample sizes, the coverage probabilities of the Wald confidence interval prove to be erratic for extreme values of p. Three alternative confidence intervals, namely the Wilson confidence interval, the Clopper-Pearson interval, and the likelihood interval, are compared to the Wald confidence interval on the basis of coverage probability and expected length by means of simulation.
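To make the comparison concrete, here is a minimal sketch (not the authors' simulation code) of the three closed-form intervals named above, using the standard textbook formulas with scipy for the normal and beta quantiles; the likelihood interval is omitted because it requires a numerical search.

```python
import numpy as np
from scipy import stats

def wald_ci(x, n, alpha=0.05):
    """Standard Wald interval: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    z = stats.norm.ppf(1 - alpha / 2)
    p = x / n
    half = z * np.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def wilson_ci(x, n, alpha=0.05):
    """Wilson (score) interval obtained by inverting the score test."""
    z = stats.norm.ppf(1 - alpha / 2)
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def clopper_pearson_ci(x, n, alpha=0.05):
    """Exact Clopper-Pearson interval from beta-distribution quantiles."""
    lower = 0.0 if x == 0 else stats.beta.ppf(alpha / 2, x, n - x + 1)
    upper = 1.0 if x == n else stats.beta.ppf(1 - alpha / 2, x + 1, n - x)
    return lower, upper

# Example: 3 successes out of 40 trials (an "extreme p" scenario)
for f in (wald_ci, wilson_ci, clopper_pearson_ci):
    print(f.__name__, f(3, 40))
```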
In data envelopment analysis (DEA), input and output values are subject to change for several reasons. Such variations differ across input/output items and across decision-making units (DMUs). Hence, DEA efficiency scores need to be examined by considering these factors. In this paper, we propose new resampling models based on these variations for gauging the confidence intervals of DEA scores. The first model utilizes past-present data for estimating data variations, imposing chronological-order weights supplied by the Lucas series (a variant of the Fibonacci series). The second model deals with future prospects: it aims at forecasting the future efficiency score and its confidence interval for each DMU. We applied our models to a dataset composed of Japanese municipal hospitals.
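A rough sketch of the chronological-weighting idea only: the mapping of Lucas numbers to sampling weights and the resample_periods helper are illustrative assumptions, and the DEA evaluation itself is not shown.

```python
import numpy as np

def lucas_numbers(t):
    """First t Lucas numbers: 2, 1, 3, 4, 7, 11, ..."""
    seq = [2, 1]
    while len(seq) < t:
        seq.append(seq[-1] + seq[-2])
    return np.array(seq[:t], dtype=float)

def resample_periods(panel, n_rep=1000, seed=0):
    """panel: array of shape (T, n_dmu, n_items) holding past-present input/output data.
    Each replicate draws one historical period, with Lucas-series weights so that
    more recent periods are more likely to be chosen."""
    rng = np.random.default_rng(seed)
    T = panel.shape[0]
    w = lucas_numbers(T)
    w /= w.sum()
    idx = rng.choice(T, size=n_rep, p=w)
    return panel[idx]   # n_rep resampled cross-sections of shape (n_dmu, n_items)

# Each resampled cross-section would then be fed to a DEA solver, and the 2.5% / 97.5%
# percentiles of the resulting scores taken as a confidence interval for each DMU.
panel = np.random.default_rng(1).uniform(1, 10, size=(8, 20, 4))  # 8 periods, 20 DMUs, 4 items
print(resample_periods(panel, n_rep=5).shape)
```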
Background: Markov chains (MC) have been widely used to model molecular sequences. The estimation of the MC transition matrix and confidence intervals of the transition probabilities from long sequence data has been intensively studied in the past decades. In next-generation sequencing (NGS), a large number of short reads are generated. These short reads can overlap, and some regions of the genome may not be sequenced, resulting in a new type of data. Based on NGS data, the transition probabilities of an MC can be estimated by moment estimators. However, the classical asymptotic distribution theory for MC transition probability estimators based on long sequences is no longer valid. Methods: In this study, we present the asymptotic distributions of several statistics related to MC based on NGS data. We show that, after scaling by the effective coverage d defined in a previous study by the authors, these statistics based on NGS data approximate the same distributions as the corresponding statistics for long sequences. Results: We apply the asymptotic properties of these statistics to find theoretical confidence regions for MC transition probabilities based on NGS short-read data. We validate our theoretical confidence intervals using both simulated data and real data sets, and compare the results with those obtained by the parametric bootstrap method. Conclusions: We find that the asymptotic distributions of these statistics and the theoretical confidence intervals of transition probabilities based on NGS data given in this study are highly accurate, providing a powerful tool for NGS data analysis.
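For orientation, a sketch of the classical long-sequence moment estimator and Wald-type interval for a single transition probability; the paper's NGS adjustment (scaling by the effective coverage d) is not reproduced here, and the helper name is illustrative.

```python
import numpy as np
from scipy import stats

def transition_ci(counts, i, j, alpha=0.05):
    """counts: K x K matrix of observed transition counts n_ij.
    Returns the moment estimate p_ij = n_ij / n_i. and a Wald-type interval."""
    n_i = counts[i].sum()
    p = counts[i, j] / n_i
    z = stats.norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(p * (1 - p) / n_i)
    return p, (max(0.0, p - half), min(1.0, p + half))

# toy 2-state example (e.g. purine/pyrimidine states)
counts = np.array([[120, 40],
                   [35, 105]])
print(transition_ci(counts, 0, 1))
```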
In cancer survival analysis, one very frequently needs to estimate confidence intervals for survival probabilities. However, this calculation is either not included in most popular computer packages or only one method of estimation is provided. In the present paper, we describe a microcomputer program for estimating the confidence intervals of survival probabilities when the survival functions are estimated using the Kaplan-Meier product-limit or the life-table method. There are five methods of estimation in the program (SPCI): the classical method (based on Greenwood's formula for the variance of S(ti)), the Rothman-Wilson method, and the arcsine, log(-log), and logit transformation methods. Two example analyses are given to test the performance of the program.
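The SPCI program itself is not reproduced here, but a minimal sketch of two of the five methods it implements — the Greenwood (classical) interval and the log(-log) transformation — from the Kaplan-Meier ingredients (numbers at risk n_i and events d_i); the helper name is illustrative.

```python
import numpy as np
from scipy import stats

def km_with_ci(n_risk, events, alpha=0.05):
    """Kaplan-Meier estimate with Greenwood and log(-log) confidence intervals.
    n_risk, events: subjects at risk and events at each event time (events > 0 assumed)."""
    n_risk = np.asarray(n_risk, float)
    events = np.asarray(events, float)
    z = stats.norm.ppf(1 - alpha / 2)

    s = np.cumprod(1 - events / n_risk)                    # S(t_i)
    g = np.cumsum(events / (n_risk * (n_risk - events)))   # Greenwood sum
    se = s * np.sqrt(g)                                    # Greenwood standard error

    classic = np.clip(np.stack([s - z * se, s + z * se]), 0, 1)

    # log(-log) transformation keeps the interval inside (0, 1)
    se_loglog = np.sqrt(g) / np.abs(np.log(s))
    lower = s ** np.exp(z * se_loglog)
    upper = s ** np.exp(-z * se_loglog)
    return s, classic, (lower, upper)

s, classic, loglog = km_with_ci([20, 17, 15, 12], [2, 1, 2, 3])
print(np.round(s, 3), np.round(loglog[0], 3), np.round(loglog[1], 3))
```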
We discuss formulas and techniques for finding maximum-likelihood estimators of parameters of autoregressive (with particular emphasis on Markov and Yule) models, computing their asymptotic variance-covariance matrix and displaying the resulting confidence regions; Monte Carlo simulation is then used to establish the accuracy of the corresponding level of confidence. The results indicate that a direct application of the Central Limit Theorem yields errors too large to be acceptable; instead, we recommend using a technique based directly on the natural logarithm of the likelihood function, verifying its substantially higher accuracy. Our study is then extended to the case of estimating only a subset of a model's parameters, when the remaining ones (called nuisance) are of no interest to us.
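The likelihood-based construction the authors recommend is, in its generic form (a sketch, not their exact derivation), the set of parameter values whose log-likelihood lies within a chi-square quantile of its maximum:

```latex
% Likelihood-ratio confidence region for a k-dimensional parameter \theta
\left\{\theta : 2\left[\ln L(\hat\theta) - \ln L(\theta)\right] \le \chi^{2}_{k,\,1-\alpha}\right\}.
```

Nuisance parameters would be handled by maximizing ln L over them (profiling) before applying the same inequality with k equal to the number of parameters of interest.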
Various random models with balanced data that are relevant for analyzing practical test data are described, along with several hypothesis testing and interval estimation problems concerning variance components. In this paper, we mainly consider these problems in the general random effect model with balanced data. Exact tests and confidence intervals for a single variance component corresponding to a random effect are developed by using generalized p-values and generalized confidence intervals. The resulting procedures are easy to compute and are applicable to small samples. Exact tests and confidence intervals are also established for comparing the random-effects variance components and the sum of random-effects variance components in two independent general random effect models with balanced data. Furthermore, we investigate the statistical properties of the resulting tests. Finally, some simulation results on the type I error probability and power of the proposed test are reported. The simulation results indicate that the exact test is extremely satisfactory for controlling the type I error probability.
Hydrological risk is highly dependent on the occurrence of extreme rainfalls. This fact has led to a wide range of studies on the estimation and uncertainty analysis of the extremes. In most cases, confidence intervals (CIs) are constructed to represent the uncertainty of the estimates. Since the accuracy of CIs depends on the asymptotic normality of the data and is questionable with limited observations in practice, the Bayesian highest posterior density (HPD) interval, the bootstrap percentile interval, and the profile likelihood (PL) interval have been introduced to analyze uncertainty without relying on the normality assumption. However, comparative studies investigating their performance in terms of the accuracy and uncertainty of the estimates are scarce. In addition, the strengths, weaknesses, and conditions necessary for applying each method also must be investigated. Accordingly, in this study, test experiments with simulations from varying parent distributions and different sample sizes were conducted. Then, applications to annual maximum rainfall (AMR) time series data in South Korea were performed. Five districts with 38 years (1973–2010) of AMR observations were fitted by the three aforementioned methods in the application. From both the experimental and application results, the Bayesian method is found to provide the lowest uncertainty of the design level, while the PL estimates generally have the highest accuracy but also the largest uncertainty. The bootstrap estimates are usually inferior to the other two methods, but can perform adequately when the distribution model is not heavy-tailed and the sample size is large. The distribution tail behavior and the sample size clearly affect the estimation accuracy and uncertainty. This study presents a comparative result, which can help researchers make decisions in the context of assessing extreme rainfall uncertainties.
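As an illustration of the bootstrap percentile method alone — assuming, for the sketch, that the annual maxima are modeled with a GEV distribution via scipy and that the quantity of interest is the T-year return level; the paper's actual distribution choices and data are not reproduced.

```python
import numpy as np
from scipy import stats

def bootstrap_return_level_ci(amr, T=100, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the T-year return level of annual maxima."""
    rng = np.random.default_rng(seed)
    amr = np.asarray(amr, float)
    levels = np.empty(n_boot)
    for b in range(n_boot):
        sample = rng.choice(amr, size=amr.size, replace=True)
        c, loc, scale = stats.genextreme.fit(sample)          # GEV fit by MLE
        levels[b] = stats.genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)
    return np.percentile(levels, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# synthetic 38-year record, mimicking the sample size used in the study
rng = np.random.default_rng(0)
amr = stats.genextreme.rvs(-0.1, loc=150, scale=40, size=38, random_state=rng)
print(bootstrap_return_level_ci(amr, T=100))
```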
Detecting population (group) differences is useful in many applications, such as medical research. In this paper, we explore the probabilistic theory for identifying the quantile differences between two populations. Suppose that there are two populations x and y with missing data on both of them, where x is nonparametric and y is parametric. We are interested in constructing confidence intervals on the quantile differences of x and y. Random hot deck imputation is used to fill in the missing data. Semi-empirical likelihood confidence intervals on the differences are constructed.
Point-wise confidence intervals for a nonparametric regression function with random design points are considered. The confidence intervals are those based on the traditional normal approximation and the empirical likelihood. Their coverage accuracy is assessed by developing the Edgeworth expansions for the coverage probabilities. It is shown that the empirical likelihood confidence intervals are Bartlett correctable.
This paper considers two estimators of θ = g(x) in a nonparametric regression model Y = g(x) + ε (x ∈ (0, 1)^p) with missing responses: imputation and inverse probability weighted estimators. Asymptotic normality of the two estimators is established, which is used to construct normal approximation based confidence intervals on θ.
Suppose that there are two nonparametric populations x and y with missing data on both of them. We are interested in constructing confidence intervals on the quantile differences of x and y. Random imputation is used. Empirical likelihood confidence intervals on the differences are constructed.
In this paper, Scheffé and Simplified Scheffé simultaneous confidence intervals are first constructed for the mean differences of several multivariate normal distributions. The authors then theoretically prove that when there are only two populations, the Bonferroni bounds and the Simplified Scheffé bounds coincide and are shorter than the Scheffé bounds for p ≤ 10. In the case 3 ≤ k ≤ 10 and 2 ≤ p ≤ 10, there exists n(p, k) such that the Bonferroni method is better than the Simplified Scheffé procedure for n ≥ n(p, k); otherwise the Simplified Scheffé procedure is better. Finally, through numerical calculation the authors find that neither the Scheffé critical values nor the Simplified Scheffé critical values are always larger than the other.
Although there are many measures of variability for qualitative variables, they are little used in social research, nor are they included in statistical software. The aim of this article is to present six easily computed measures of variation for qualitative variables, and to facilitate their use by means of the R software. The measures considered are, on the one hand, Freeman's variation ratio, Moral's universal variation ratio, Kvalseth's standard deviation from the mode, and Wilcox's variation ratio, which are most affected by proximity to a constant random variable, where measures of variability for qualitative variables reach their minimum value of 0. On the other hand, the Gibbs-Poston index of qualitative variation and Shannon's relative entropy are included, which are more affected by proximity to a uniform distribution, where measures of variability for qualitative variables reach their maximum value of 1. Point and interval estimation are addressed. Bootstrap confidence intervals are obtained by the percentile and bias-corrected and accelerated (BCa) percentile methods. Two calculation situations are presented: with a single sample mode and with two or more modes. The standard deviation from the mode, among the six measures considered, and the universal variation ratio, among the three variation ratios, are particularly recommended for use.
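A brief sketch of three of the measures (the variation ratio, the Gibbs-Poston index of qualitative variation, and Shannon's relative entropy) with a percentile bootstrap interval; the remaining measures and the BCa correction are left to the R implementation described in the article, and the helper names here are illustrative only.

```python
import numpy as np

def variation_ratio(x):
    """1 - (frequency of the modal category) / n."""
    _, counts = np.unique(x, return_counts=True)
    return 1 - counts.max() / counts.sum()

def gibbs_poston_iqv(x, k=None):
    """Index of qualitative variation: (k/(k-1)) * (1 - sum p_i^2)."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    k = k or len(counts)
    return (k / (k - 1)) * (1 - np.sum(p**2))

def relative_entropy(x, k=None):
    """Shannon entropy divided by its maximum, log(k)."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    k = k or len(counts)
    return -np.sum(p * np.log(p)) / np.log(k)

def percentile_ci(x, stat, n_boot=2000, alpha=0.05, seed=0):
    """Plain percentile bootstrap interval for any statistic of a qualitative sample."""
    rng = np.random.default_rng(seed)
    reps = [stat(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)]
    return np.percentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])

x = np.array(list("aaabbbbccd"))      # toy qualitative sample with a single mode
k = len(np.unique(x))                 # fix k from the original sample for the bootstrap
print(variation_ratio(x), gibbs_poston_iqv(x), relative_entropy(x))
print(percentile_ci(x, lambda s: relative_entropy(s, k=k)))
```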
To improve the forecasting reliability of travel time, the time-varying confidence interval of travel time on arterials is forecasted using an autoregressive integrated moving average and generalized autoregressive conditional heteroskedasticity (ARIMA-GARCH) model, in which the ARIMA model serves as the mean equation of the GARCH model to capture travel time levels, and the GARCH model describes the conditional variances of travel time. The proposed method is validated and evaluated using actual traffic flow data collected from the traffic monitoring system of Kunshan city. The evaluation results show that, compared with the conventional ARIMA model, the proposed model does not significantly improve the forecasting of travel time levels but has an advantage in forecasting travel time volatility. The proposed model captures the heteroskedasticity of travel time well and forecasts time-varying confidence intervals that reflect the volatility of observed travel times better than the fixed confidence interval provided by the ARIMA model.
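A rough sketch of the two-stage idea, assuming the statsmodels and arch Python packages and placeholder orders ARIMA(1,0,1) and GARCH(1,1); the paper's order selection and the Kunshan detector data are not reproduced.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from arch import arch_model
from scipy import stats

def arima_garch_interval(y, alpha=0.05):
    """One-step-ahead time-varying CI: ARIMA mean forecast +/- z * GARCH sigma."""
    mean_model = ARIMA(y, order=(1, 0, 1)).fit()
    resid = mean_model.resid

    vol_model = arch_model(resid, mean="Zero", vol="Garch", p=1, q=1)
    vol_fit = vol_model.fit(disp="off")

    mean_fc = mean_model.forecast(steps=1)[0]                    # next travel-time level
    var_fc = vol_fit.forecast(horizon=1).variance.values[-1, 0]  # conditional variance
    z = stats.norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(var_fc)
    return mean_fc - half, mean_fc + half

# synthetic travel-time series standing in for the arterial detector data
rng = np.random.default_rng(0)
y = 300 + np.cumsum(rng.normal(0, 2, 500)) * 0.1 + rng.normal(0, 5, 500)
print(arima_garch_interval(y))
```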
The random finite difference method (RFDM) is a popular approach to quantitatively evaluate the influence of the inherent spatial variability of soil on the deformation of embedded tunnels. However, its high computational cost remains a challenge for application in complex scenarios. To address this limitation, a deep learning-based method for efficient prediction of tunnel deformation in spatially variable soil is proposed. The method uses a one-dimensional convolutional neural network (CNN) to identify the pattern between the random field input and the factor of safety of tunnel deformation output. The mean squared error and correlation coefficient of the CNN model applied to previously unseen data were less than 0.02 and larger than 0.96, respectively. This means that the trained CNN model can replace RFDM analysis in Monte Carlo simulations with a small but sufficient number of random field samples (about 40 samples for each case in this study). Machine learning and deep learning models share a well-known limitation: the confidence of a prediction is unknown and only a deterministic outcome is given. This calls for an approach to gauge the model's confidence interval. It is achieved here by applying dropout to all layers of the original model, retraining the model, and keeping the dropout active when performing inference. The excellent agreement between the CNN model predictions and the RFDM results demonstrates that the proposed deep learning-based method has potential for tunnel performance analysis in spatially variable soils.
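A minimal PyTorch-style sketch of the dropout-at-inference idea (Monte Carlo dropout) described above; the toy 1-D CNN and random inputs stand in for the paper's actual architecture and random-field samples.

```python
import torch
import torch.nn as nn

class DropoutCNN(nn.Module):
    """Toy 1-D CNN with dropout after every layer, mimicking the retrained model."""
    def __init__(self, n_channels=1, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=5, padding=2), nn.ReLU(), nn.Dropout(dropout),
            nn.Conv1d(16, 16, kernel_size=5, padding=2), nn.ReLU(), nn.Dropout(dropout),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(16, 1),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_interval(model, x, n_samples=200, alpha=0.05):
    """Keep dropout active at inference and read an empirical prediction interval."""
    model.train()                      # leaves the Dropout layers stochastic
    preds = torch.stack([model(x) for _ in range(n_samples)])
    lo = torch.quantile(preds, alpha / 2, dim=0)
    hi = torch.quantile(preds, 1 - alpha / 2, dim=0)
    return preds.mean(dim=0), lo, hi

model = DropoutCNN()
x = torch.randn(4, 1, 64)              # 4 toy random-field profiles of length 64
mean, lo, hi = mc_dropout_interval(model, x)
print(mean.squeeze(), lo.squeeze(), hi.squeeze())
```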
A novel damage detection method is applied to a 3-story frame structure to obtain a statistical quantification control criterion for the existence, location, and identification of damage. The mean, standard deviation, and exponentially weighted moving average (EWMA) are applied to detect damage information according to statistical process control (SPC) theory. It is concluded that detection with the mean and the EWMA is insignificant because the structural response is neither independent nor normally distributed. On the other hand, damage information is detected well with the standard deviation, because the influence of the data distribution is not pronounced for this parameter. A suitable moderate confidence level is explored for more significant damage location and quantification detection, and the impact of noise is investigated to illustrate the robustness of the method.
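For reference, a sketch of the EWMA control chart used in the comparison, with the standard smoothing recursion and time-varying control limits; the baseline statistics and the damage feature series below are stand-ins, not data from the frame test.

```python
import numpy as np

def ewma_chart(x, mu0, sigma0, lam=0.2, L=3.0):
    """EWMA statistic z_i = lam*x_i + (1-lam)*z_{i-1} with control limits
    mu0 +/- L*sigma0*sqrt(lam/(2-lam)*(1-(1-lam)^(2i)))."""
    x = np.asarray(x, float)
    z = np.empty_like(x)
    prev = mu0
    for i, xi in enumerate(x):
        prev = lam * xi + (1 - lam) * prev
        z[i] = prev
    i = np.arange(1, len(x) + 1)
    width = L * sigma0 * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
    return z, mu0 - width, mu0 + width

# baseline (undamaged) segment followed by a shifted (damaged) segment
rng = np.random.default_rng(0)
feature = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(1.5, 1.0, 30)])
z, lcl, ucl = ewma_chart(feature, mu0=0.0, sigma0=1.0)
print(np.where((z > ucl) | (z < lcl))[0][:5])   # first out-of-control samples
```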