Abstract: This paper presents a new class of test procedures for the two-sample location problem based on subsample quantiles. The class includes the Mann-Whitney test as a special case. The asymptotic normality of the proposed class of tests is established. The asymptotic relative performance of the proposed class with respect to the optimal member of Xie and Priebe (2000) is studied in terms of Pitman efficiency for various underlying distributions.
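As a point of reference for the special case named above, the following is a minimal sketch of the classical Mann-Whitney statistic and its large-sample standardization. It is not the subsample-quantile class proposed in the paper; the function names are illustrative, and the variance formula assumes no ties.

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """Count pairs (x, y) with x < y; tied pairs contribute one half."""
    u = 0.0
    for x, y in product(xs, ys):
        if x < y:
            u += 1.0
        elif x == y:
            u += 0.5
    return u


def standardized_u(xs, ys):
    """Large-sample z-score of U under the null of no location shift.

    Uses the no-ties variance m*n*(m+n+1)/12 (an assumption of this sketch).
    """
    m, n = len(xs), len(ys)
    mean = m * n / 2.0
    var = m * n * (m + n + 1) / 12.0
    return (mann_whitney_u(xs, ys) - mean) / var ** 0.5


if __name__ == "__main__":
    print(mann_whitney_u([1, 3, 5], [2, 4, 6]))
    print(standardized_u([1, 3, 5], [2, 4, 6]))
```

A large positive z-score suggests that the second sample is shifted to the right of the first.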
Abstract: The objectives of this paper are to demonstrate the algorithms employed by three statistical software programs (R, Real Statistics using Excel, and SPSS) for calculating the exact two-tailed probability of the Wald-Wolfowitz one-sample runs test for randomness, to present a novel approach for computing this probability, and to compare the four procedures by generating samples of 10 and 11 data points, varying the parameters n₀ (number of zeros) and n₁ (number of ones), as well as the number of runs. Fifty-nine samples were created to replicate the behavior of the distribution of the number of runs with 10 and 11 data points. The exact two-tailed probabilities for the four procedures were compared using Friedman’s test. Given the significant difference in central tendency, post-hoc comparisons were conducted using Conover’s test with the Benjamini-Yekutieli correction. It is concluded that the procedures of Real Statistics using Excel and R exhibit some inadequacies in the calculation of the exact two-tailed probability, whereas the new proposal and the SPSS procedure are deemed more suitable. The proposed robust algorithm has a more transparent rationale than the SPSS one, although it is somewhat more conservative. We recommend its implementation for this test and its application to others, such as the binomial and sign tests.
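To make the quantity under comparison concrete, here is a minimal sketch of the exact distribution of the number of runs for n0 zeros and n1 ones, using the standard combinatorial formula. The two-tailed rule shown (summing the probabilities of all outcomes no more likely than the observed one) is only one common convention and is an assumption of this sketch; it is not claimed to match R, Real Statistics, SPSS, or the paper's proposal.

```python
from fractions import Fraction
from math import comb


def runs_pmf(r, n0, n1):
    """Exact P(R = r) for a random arrangement of n0 zeros and n1 ones (both >= 1)."""
    if r < 2 or r > n0 + n1:
        return Fraction(0)
    total = comb(n0 + n1, n0)
    k, odd = divmod(r, 2)
    if not odd:  # r = 2k: k runs of each symbol, either symbol may come first
        ways = 2 * comb(n0 - 1, k - 1) * comb(n1 - 1, k - 1)
    else:        # r = 2k + 1: one symbol has k + 1 runs, the other has k
        ways = (comb(n0 - 1, k) * comb(n1 - 1, k - 1)
                + comb(n0 - 1, k - 1) * comb(n1 - 1, k))
    return Fraction(ways, total)


def two_tailed_p(r_obs, n0, n1):
    """Sum of P(R = r) over all r that are no more likely than r_obs (assumed rule)."""
    p_obs = runs_pmf(r_obs, n0, n1)
    probs = [runs_pmf(r, n0, n1) for r in range(2, n0 + n1 + 1)]
    return sum((p for p in probs if p <= p_obs), Fraction(0))


if __name__ == "__main__":
    print(two_tailed_p(2, 5, 5))
```

Exact fractions avoid the floating-point ties that can change which outcomes are counted as "at least as extreme".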
Abstract: This paper investigates the normality of a real data set obtained from waist measurements of a group of 49 young adults. The quantile-quantile (Q-Q) plot and an analysis of the correlation coefficients of the Q-Q plot are used to assess whether the data are normally distributed. In this regard, the probabilities of the quantiles were computed, modified, and plotted. The correlation coefficients for the Q-Q plots were then obtained. Results indicate that, at the 0.1 level of significance, the data for the young adult males in the sample were not normally distributed and had a mean value within the low-risk range in terms of health, whereas the data for the young adult females showed reasonable normality, also with a mean value within the low-risk range.
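The Q-Q correlation approach described above can be sketched as follows. The Blom plotting positions (i - 0.375) / (n + 0.25) are an assumed choice, since the abstract does not say which modification of the probabilities was used, and the critical value against which the coefficient is compared is not reproduced here.

```python
from statistics import NormalDist, fmean


def qq_correlation(data):
    """Pearson correlation between ordered data and standard normal quantiles."""
    n = len(data)
    ordered = sorted(data)
    # Blom plotting positions (assumption of this sketch).
    probs = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
    theo = [NormalDist().inv_cdf(p) for p in probs]
    mx, my = fmean(theo), fmean(ordered)
    sxy = sum((a - mx) * (b - my) for a, b in zip(theo, ordered))
    sxx = sum((a - mx) ** 2 for a in theo)
    syy = sum((b - my) ** 2 for b in ordered)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    skewed = [2.0 ** i for i in range(20)]  # hypothetical, strongly right-skewed
    print(qq_correlation(skewed))
```

A coefficient close to 1 is consistent with normality; normality is rejected when the coefficient falls below the tabulated critical value for the sample size and significance level.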
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 11271014 and 11971045.
Abstract: This paper focuses on the goodness-of-fit test of the functional linear composite quantile regression model. A nonparametric test is proposed by using the orthogonality of the residual and its conditional expectation under the null model. The proposed test statistic has an asymptotic standard normal distribution under the null hypothesis, and tends to infinity in probability under the alternative hypothesis, which implies the consistency of the test. Furthermore, it is proved that the test statistic converges to a normal distribution with nonzero mean under a local alternative hypothesis. Extensive simulations are reported, and the results show that the proposed test has proper sizes and is sensitive to the considered model discrepancies. The proposed methods are also applied to two real datasets.
Abstract: The objective of this study is to propose the Parametric Seven-Number Summary (PSNS) as a significance test for normality and to verify its accuracy and power in comparison with two well-known tests: Royston’s W test and the D’Agostino-Belanger-D’Agostino K-squared test. An experiment with 384 conditions was simulated. The conditions were generated by crossing 24 sample sizes and 16 types of continuous distributions: one normal and 15 non-normal. The percentage of success in maintaining the null hypothesis of normality against normal samples and in rejecting the null hypothesis against non-normal samples (accuracy) was calculated. In addition, the type II error against normal samples and the statistical power against normal samples were computed. Comparisons of percentages and means were performed using Cochran’s Q-test, Friedman’s test, and repeated measures analysis of variance. With sample sizes of 150 or greater, high accuracy and mean power or type II error (≥0.70 and ≥0.80, respectively) were achieved. All three normality tests were similarly accurate; however, the PSNS-based test showed lower mean power than the K-squared and W tests, especially against non-normal samples of symmetrical-platykurtic distributions, such as the uniform, semicircle, and arcsine distributions. It is concluded that the PSNS-based omnibus test is accurate and powerful for testing normality with samples of at least 150 observations.
Abstract: Zero-inflated distributions are common in statistical problems where there is interest in testing homogeneity of two or more independent groups. Often, the underlying distribution that has an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparisons of the groups in terms of their respective percentiles may be appropriate, as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes but may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with the comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size that is necessary to achieve suitable test performance in specific examples.
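Abstract: To support the local interpretation and application of the ECMWF (European Centre for Medium-Range Weather Forecasts) model and to improve the accuracy of precipitation forecasts in Sichuan Province, the systematic biases of the model's precipitation forecasts at each intensity level were analyzed for July to September of 2020 and 2021. The model forecasts more rainy days than observed, especially in the Panxi region and the western Sichuan Plateau; it forecasts more heavy-rain days than observed in the southwestern part of the basin and in the Panxi region, but fewer than observed in the southern part of the basin. Correction experiments and verification for heavy precipitation were then carried out on the model's 24 h accumulated precipitation forecasts using quantile mapping. After quantile-mapping correction, the threat score (TS) for rainstorm and above increased by 7%~15%, the TS at every precipitation level was 2%~4% higher than that of the multi-model ensemble objective forecast product, and the hit rates for heavy rain and above and for rainstorm and above increased by 10%~20%. The corrected rain-band position, especially the rainstorm area, was closer to observations.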
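The quantile mapping step can be illustrated with a minimal empirical sketch: a forecast value is located within the distribution of historical forecasts and replaced by the observed value at the same quantile level. The number of quantile levels, the linear interpolation, and the clamping at the extremes are assumptions of this sketch and are not taken from the paper.

```python
from bisect import bisect_right


def quantile(sorted_vals, p):
    """Linearly interpolated sample quantile at level p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def build_mapping(model_sample, obs_sample, levels=20):
    """Return paired quantile tables (model, observed) at evenly spaced levels."""
    model_sorted, obs_sorted = sorted(model_sample), sorted(obs_sample)
    probs = [k / levels for k in range(levels + 1)]
    return ([quantile(model_sorted, p) for p in probs],
            [quantile(obs_sorted, p) for p in probs])


def correct(x, model_q, obs_q):
    """Map forecast x to the observed value at the same quantile level."""
    if x <= model_q[0]:
        return obs_q[0]
    if x >= model_q[-1]:
        return obs_q[-1]  # clamp beyond the training range (assumption)
    i = bisect_right(model_q, x)  # model_q[i-1] <= x < model_q[i]
    t = (x - model_q[i - 1]) / (model_q[i] - model_q[i - 1])
    return obs_q[i - 1] + t * (obs_q[i] - obs_q[i - 1])


if __name__ == "__main__":
    # Hypothetical 24 h totals (mm): the model is wetter than the observations.
    model_q, obs_q = build_mapping([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
    print(correct(6.0, model_q, obs_q))
```

In operational use the tables would be built per station or grid point from a long training period, and the treatment of values beyond the training range matters most for the rainstorm-level amounts discussed above.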
Abstract: Unlike height-diameter equations for standing trees commonly used in forest resources modelling, tree height models for cut-to-length (CTL) stems tend to produce prediction errors whose distributions are not conditionally normal but are rather leptokurtic and heavy-tailed. This feature was merely noticed in previous studies but never thoroughly investigated. This study characterized the prediction error distribution of a newly developed tree height model of this kind for Pinus radiata (D. Don) through the three-parameter Burr Type XII (BXII) distribution. The model’s prediction errors (ε) exhibited heteroskedasticity conditional mainly on the small end relative diameter of the top log and also on DBH to a minor extent. Structured serial correlations were also present in the data. A total of 14 candidate weighting functions were compared to select the best two for weighting ε in order to reduce its conditional heteroskedasticity. The weighted prediction errors (εw) were shifted by a constant to the positive range supported by the BXII distribution. Then the distribution of weighted and shifted prediction errors (εw+) was characterized by the BXII distribution using maximum likelihood estimation through 1000 rounds of repeated random sampling, fitting and goodness-of-fit testing, each time by randomly taking only one observation from each tree to circumvent the potential adverse impact of serial correlation in the data on parameter estimation and inferences. The nonparametric two-sample Kolmogorov-Smirnov (KS) goodness-of-fit test and its closely related Kuiper’s (KU) test showed that the fitted BXII distributions provided a good fit to the highly leptokurtic and heavy-tailed distribution of ε. Random samples generated from the fitted BXII distributions of εw+ derived from using the best two weighting functions, when back-shifted and unweighted, exhibited distributions that were, in about 97% and 95% of the 1000 cases respectively, not statistically different from the distribution of ε. Our results for cut-to-length P. radiata stems represented the first case of any tree species where a non-normal error distribution in tree height prediction was described by an underlying probability distribution. The fitted BXII prediction error distribution will help to unlock the full potential of the new tree height model in forest resources modelling of P. radiata plantations, particularly when uncertainty assessments, statistical inferences and error propagations are needed in research and practical applications through harvester data analytics.
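For readers unfamiliar with the Burr Type XII family, a minimal sketch of its distribution function, quantile function, and inverse-transform sampling follows. The parameterization (shape parameters c and k, plus a scale) is the common textbook form and is assumed here; the paper's fitted values, weighting functions, and shift constant are not reproduced.

```python
import random


def burr12_cdf(x, c, k, scale=1.0):
    """F(x) = 1 - (1 + (x / scale)^c)^(-k) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / scale) ** c) ** (-k)


def burr12_quantile(p, c, k, scale=1.0):
    """Inverse of burr12_cdf for p in [0, 1)."""
    return scale * ((1.0 - p) ** (-1.0 / k) - 1.0) ** (1.0 / c)


def burr12_sample(n, c, k, scale=1.0, seed=0):
    """Draw n values by inverse-transform sampling."""
    rng = random.Random(seed)
    return [burr12_quantile(rng.random(), c, k, scale) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    draws = burr12_sample(5, c=3.0, k=1.5, scale=2.0)
    print(draws)
```

Because the support is strictly positive, errors that can be negative must first be shifted by a constant, as the abstract describes, and simulated values shifted back afterwards.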
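Abstract: In the literature, quantile regression models are linear, but in practice this assumption often does not hold well enough. A threshold quantile regression model is therefore proposed and used to analyze empirically the conditional VaR of a single stock (Pudong Development Bank). A liquidity risk indicator is chosen as the conditioning variable, so this conditional VaR can also be regarded as a liquidity-adjusted VaR (La-VaR). The empirical analysis finds that the results obtained from the threshold quantile model describe actual market conditions better and also predict market risk better.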
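The idea of a threshold quantile regression can be sketched with the standard check (pinball) loss and a two-regime linear quantile function. The two-regime form, the variable names, and the parameter layout below are illustrative assumptions; the abstract does not give the paper's exact specification or estimation procedure.

```python
def check_loss(u, tau):
    """Quantile regression check loss: u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def threshold_quantile(x, z, gamma, beta_low, beta_high):
    """Conditional tau-quantile with regime chosen by threshold variable z.

    beta_low and beta_high are (intercept, slope) pairs; gamma is the threshold.
    """
    intercept, slope = beta_low if z <= gamma else beta_high
    return intercept + slope * x


def total_loss(data, tau, gamma, beta_low, beta_high):
    """Sum of check losses over (x, z, y) observations; estimation minimizes this."""
    return sum(
        check_loss(y - threshold_quantile(x, z, gamma, beta_low, beta_high), tau)
        for x, z, y in data
    )


if __name__ == "__main__":
    # Hypothetical observations: (regressor, liquidity indicator, return).
    data = [(1.0, 0.2, -0.5), (2.0, 0.8, -3.0), (1.5, 0.9, -2.0)]
    print(total_loss(data, tau=0.05, gamma=0.5,
                     beta_low=(0.0, -0.5), beta_high=(-1.0, -1.0)))
```

With tau set to a small level such as 0.05, the fitted quantile of returns plays the role of the conditional VaR, and the threshold on the liquidity indicator lets that VaR respond differently in illiquid conditions.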