The F distribution is one of the most frequently used distributions in statistics. For example, it is used to test equality of variances of two independent normal distributions, equality of means in the one-way ANOVA setting, the overall significance of a normal linear regression model, and so on. In this paper, a simple chi-square approximation to the cumulative distribution function of the F distribution is obtained via an adjusted log-likelihood ratio statistic. This new approximation exhibits remarkable accuracy even when the degrees of freedom of the F distribution are small.
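The abstract does not state the approximation itself, so it is not reproduced here. As a minimal sketch of how any approximation to the F cumulative distribution function could be checked at small degrees of freedom, the following estimates P(F <= x) by simulation, using the standard representation of an F variable as a ratio of scaled chi-square variables. The function name `f_cdf_monte_carlo` and its parameters are illustrative assumptions, not taken from the paper.

```python
import random

def f_cdf_monte_carlo(x, df1, df2, draws=20000, seed=0):
    """Estimate P(F <= x) for an F(df1, df2) variable by simulation (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # A chi-square variable with d degrees of freedom is Gamma(shape=d/2, scale=2).
        numerator = rng.gammavariate(df1 / 2, 2.0) / df1
        denominator = rng.gammavariate(df2 / 2, 2.0) / df2
        if numerator / denominator <= x:
            hits += 1
    return hits / draws

# With equal degrees of freedom, F and 1/F share a distribution, so the true value at x = 1 is 0.5.
print(f_cdf_monte_carlo(1.0, 3, 3))
```

A candidate closed-form approximation can then be compared against these simulated values over a grid of x for small `df1` and `df2`.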
The physical mechanisms governing the dynamics of streamflow processes act on a seemingly wide range of temporal and spatial scales, and almost all of them present some degree of nonlinearity. Against this backdrop, this paper takes a critical look at Autoregressive Conditional Heteroscedasticity (ARCH), or volatility, of streamflow processes, a form of nonlinear phenomenon. To this end, daily and monthly streamflow data of the River Benue, Nigeria, were used. The results indicate that the existence of conditional heteroscedasticity in streamflow processes is no paradox. The ARCH effect is caused by seasonal variation in the variance for monthly flows, and this variation could partly explain the ARCH effect in daily streamflow. It was also evident that traditional seasonal Autoregressive Moving Average (ARMA) models are inadequate for describing the ARCH effect in the daily streamflow process, though robust for monthly streamflow; the effect can be removed if proper deseasonalisation pre-processing is done. Considering these findings, the potential of hybrid ARMA and Generalised Autoregressive Conditional Heteroscedasticity (GARCH)-type models should be further explored, and probably embraced, for modelling the daily streamflow regime, in view of the relevance of statistical modelling in hydrology.
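The abstract does not say which test was used to detect the ARCH effect. One common choice is Engle's Lagrange multiplier test, sketched below with a single lag: the squared residuals are regressed on their own first lag, and n times the R-squared is compared with a chi-square critical value with 1 degree of freedom. The series here are synthetic stand-ins (an assumed ARCH(1) process and white noise), not the River Benue data, and the function names are hypothetical.

```python
import random

CHI2_1_CRIT_5PCT = 3.841  # upper 5% point of chi-square with 1 degree of freedom

def arch_lm_lag1(residuals):
    """LM statistic n * R^2 from regressing e_t^2 on e_{t-1}^2 (one lag only)."""
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # In a simple regression, R^2 equals the squared correlation.
    return n * (sxy * sxy) / (sxx * syy)

def simulate_arch1(n, omega=0.2, alpha=0.5, seed=1):
    """Synthetic ARCH(1) series: conditional variance = omega + alpha * e_{t-1}^2."""
    rng = random.Random(seed)
    previous, series = 0.0, []
    for _ in range(n):
        sigma2 = omega + alpha * previous ** 2
        previous = rng.gauss(0.0, sigma2 ** 0.5)
        series.append(previous)
    return series

def simulate_white_noise(n, seed=2):
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

lm_arch = arch_lm_lag1(simulate_arch1(2000))
lm_noise = arch_lm_lag1(simulate_white_noise(2000))
print("ARCH(1) series:", lm_arch, "reject" if lm_arch > CHI2_1_CRIT_5PCT else "do not reject")
print("white noise:   ", lm_noise, "reject" if lm_noise > CHI2_1_CRIT_5PCT else "do not reject")
```

In practice the statistic would be applied to the residuals of a fitted (deseasonalised) ARMA model, usually with more than one lag.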
In a linear regression model, testing for uniformity of the variance of the residuals is an integral part of statistical analysis. This crucial assumption requires confirmation by statistical tests, mostly before carrying out the Analysis of Variance (ANOVA) technique. Many researchers have published papers on tests for detecting variance heterogeneity in multiple linear regression models, and many comparisons of these tests have been made using criteria such as bias, error rate, and power. Besides comparisons, modifications of some of these tests have been reported in the literature in recent years. In the multiple linear regression setting, little work has been done on comparing selected tests of the homoscedasticity assumption when linear, quadratic, square-root, and exponential forms of heteroscedasticity are injected into the residuals. The present study therefore aims to fill this gap. The paper provides a comprehensive comparative analysis of the asymptotic behaviour of selected tests of the homoscedasticity assumption, in order to find the best test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances and levels of significance. Several tests for homoscedasticity are available in the literature, but only nine were considered in this study: the Breusch-Godfrey test, studentized Breusch-Pagan test, White’s test, Nonconstant Variance Score test, Park test, Spearman Rank test, Glejser test, Goldfeld-Quandt test, and Harrison-McCabe test. Their asymptotic behaviours were examined by Monte Carlo simulations. Four forms of heteroscedastic structure, exponential and linear (generalizations of the square-root and quadratic structures), were injected into the residual part of the multiple linear regression models at sample sizes of 30, 50, 100, 200, 500 and 1000. The performances were evaluated in the R environment. Among other findings, the investigations revealed that the Glejser and Park tests were the best tests for detecting heteroscedasticity in EHS and LHS (the exponential and linear heteroscedastic structures), respectively, while the White and Harrison-McCabe tests were the best tests for checking homoscedasticity in EHS and LHS, respectively, for sample sizes less than 50.
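The simulation design described above can be illustrated with a much reduced sketch. The study itself was carried out in R with nine tests and multiple regressors. The Python version below is only an assumption-laden stand-in that uses one regressor, one test (the studentized Breusch-Pagan statistic, n times the R-squared from regressing squared residuals on the regressor), and one injected structure (error standard deviation proportional to x, chosen here for illustration). All function names, coefficients, and settings are hypothetical.

```python
import random

CHI2_1_CRIT_5PCT = 3.841  # upper 5% point of chi-square with 1 degree of freedom

def ols_fit(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y):
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

def studentized_bp(x, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on x."""
    intercept, slope = ols_fit(x, y)
    squared_residuals = [(b - intercept - slope * a) ** 2 for a, b in zip(x, y)]
    return len(x) * r_squared(x, squared_residuals)

def rejection_rate(n, heteroscedastic, reps=300, seed=0):
    """Share of simulated data sets in which the test rejects at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.uniform(1.0, 5.0) for _ in range(n)]
        # Assumed structure: error sd equal to x when heteroscedastic, constant 1 otherwise.
        y = [1.0 + 2.0 * a + rng.gauss(0.0, a if heteroscedastic else 1.0) for a in x]
        if studentized_bp(x, y) > CHI2_1_CRIT_5PCT:
            rejections += 1
    return rejections / reps

for n in (30, 100):
    print(n, "size:", rejection_rate(n, False), "power:", rejection_rate(n, True))
```

The rejection rate under constant variance estimates the test's size, and the rate under the injected structure estimates its power; repeating this over tests, structures, and sample sizes gives the kind of comparison the abstract reports.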
We consider the efficacy of a proposed linear-dimension-reduction method to potentially increase the powers of five hypothesis tests for the difference of two high-dimensional multivariate-normal population-mean vectors under the assumption of homoscedastic covariance matrices. We use Monte Carlo simulations to contrast the empirical powers of the five high-dimensional tests using both the original data and the dimension-reduced data. From the Monte Carlo simulations, we conclude that a test by Thulin [1], when performed with post-dimension-reduced data, yielded the best omnibus power for detecting a difference between two high-dimensional population-mean vectors. We also illustrate the utility of our dimension-reduction method on real data consisting of genetic sequences of two groups of patients with Crohn’s disease and ulcerative colitis.
Today, Linear Mixed Models (LMMs) are mostly fitted by assuming that random effects and errors have Gaussian distributions, and therefore by using Maximum Likelihood (ML) or REML estimation. However, for many data sets, that double assumption is unlikely to hold, particularly for the random effects, a crucial component whose magnitude assessment is key in such modeling. Alternative fitting methods that do not rely on that assumption (such as ANOVA methods and Rao’s MINQUE) apply, quite often, only to the very constrained class of variance components models. In this paper, a new computationally feasible estimation methodology is designed, first for the widely used class of 2-level (or longitudinal) LMMs, with the only assumption (beyond the usual basic ones) being that residual errors are uncorrelated and homoscedastic, and with no distributional assumption imposed on the random effects. A major asset of this new approach is that it yields nonnegative variance estimates and covariance matrix estimates that are symmetric and at least positive semi-definite. Furthermore, it is shown that when the LMM is indeed Gaussian, this new methodology differs from ML only through a slight variation in the denominator of the residual variance estimate. The new methodology actually generalizes to LMMs a well-known nonparametric fitting procedure for standard Linear Models. Finally, the methodology is also extended to ANOVA LMMs, generalizing an old method by Henderson for ML estimation in such models under normality.