Abstract: Let - be i.i.d. random variables taking values in a measurable space $(\mathcal{X}, \mathcal{B})$. Let $\varphi_1: \mathcal{X} \to \mathbb{R}$ and $\varphi: \mathcal{X}^2 \to \mathbb{R}$ be measurable functions. Assume that $\varphi$ is symmetric, i.e. $\varphi(x,y) = \varphi(y,x)$ for any $x, y \in \mathcal{X}$. Consider a U-statistic $T$, assuming that $E\varphi_1(X) = 0$, $E\varphi(x, X) = 0$ for all $x \in \mathcal{X}$, $E\varphi^2(x, X) < \infty$, and $E\varphi_1^2(X) < \infty$. We provide bounds for $\Delta_N = \sup_x |F(x) - F_0(x) - F_1(x)|$, where $F$ is the distribution function of $T$ and $F_0$, $F_1$ are its limiting distribution function and Edgeworth correction, respectively. Applications of these results to von Mises statistics are also provided.
Abstract: The author discusses Bernstein-type inequalities for degenerate U-statistics. As applications of these results, Cramér-type large deviations for studentized U-statistics are obtained under mild conditions.
Funding: This work is also supported in part by the National Natural Science Foundation of China.
Abstract: In this article, the general central limit theorem and Berry-Esseen bounds for finite-population U-statistics of degree m are established under very weak conditions. These results substantially improve those of Zhao and Chen (1987).
Funding: The research of Q.-M. Shao is partly supported by Hong Kong RGC GRF 603710, 2130344.
Abstract: Weighted U-statistics and generalized L-statistics are commonly used in statistical inference, and their asymptotic properties have been well developed. In this paper, sharp non-uniform Berry-Esseen bounds for weighted U-statistics and generalized L-statistics are established.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10971081 and 11101180) and the Basic Research Foundation of Jilin University (Grant Nos. 201001002 and 201103204).
Abstract: In this paper, we obtain an almost sure central limit theorem (ASCLT) for products of independent sums of positive random variables. An extension of the result gives an ASCLT for U-statistics.
Funding: Supported by the Natural Science Foundation of Guangdong Province (Grant No. 2016A030307019), the Higher Education Colleges and Universities Innovation Strong School Project of Guangdong Province (Grant No. 2016KTSCX153), the Science and Technology Development Fund of Macao (Grant No. 127/2016/A3), the National Natural Science Foundation of China (Grant No. 11401607), and a grant at the National University of Singapore (Grant No. R-155-000-181-114).
Abstract: In this paper, we investigate two-sample U-statistics by jackknife empirical likelihood (JEL), a versatile nonparametric approach. More precisely, we propose the method of balanced augmented jackknife empirical likelihood (BAJEL), which adds two artificial points to the original pseudo-value dataset, and we prove that the log-likelihood ratio based on the expanded dataset tends to the $\chi^2$ distribution.
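The jackknife pseudo-values that JEL operates on can be sketched as follows. This is a minimal illustration, assuming the common leave-one-out construction over the pooled sample, $V_i = nU - (n-1)U^{(-i)}$ with $n = n_1 + n_2$. The function names and the `kernel` argument are hypothetical, and the BAJEL step that adds two artificial points is not reproduced, because the abstract does not specify those points.

```python
def two_sample_u(xs, ys, kernel):
    """Two-sample U-statistic of degree (1, 1): average of kernel(x, y) over all pairs."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def jackknife_pseudo_values(xs, ys, kernel):
    """Leave-one-out pseudo-values over the pooled sample (assumed construction).

    xs and ys are lists with at least two elements each, so that every
    leave-one-out statistic is still defined.
    """
    n = len(xs) + len(ys)
    full = two_sample_u(xs, ys, kernel)
    pseudo = []
    # Delete one observation from the first sample at a time.
    for i in range(len(xs)):
        loo = two_sample_u(xs[:i] + xs[i + 1:], ys, kernel)
        pseudo.append(n * full - (n - 1) * loo)
    # Delete one observation from the second sample at a time.
    for j in range(len(ys)):
        loo = two_sample_u(xs, ys[:j] + ys[j + 1:], kernel)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo
```

With the linear kernel `lambda x, y: x - y`, the statistic reduces to a difference of sample means, which makes the pseudo-values easy to check by hand.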
Funding: The first author is supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science, and Technology (Grant No. 2011-0013791); the second author is partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada; the third author is partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada.
Abstract: Let $\{X, X_n; n \ge 1\}$ be a sequence of i.i.d. random variables with values in a measurable space $(S, \mathcal{S})$ such that $E|h(X_1, X_2, \ldots, X_m)| < \infty$, where $h$ is a measurable symmetric function from $S^m$ into $\mathbb{R} = (-\infty, \infty)$. Let $\{w_{n,i_1,i_2,\ldots,i_m}; 1 \le i_1 < i_2 < \cdots < i_m \le n, n \ge m\}$ be a matrix array of real numbers. Motivated by a result of Choi and Sung (1987), in this note we are concerned with establishing a strong law of large numbers for weighted U-statistics with kernel $h$ of degree $m$. We show that … whenever $\sup_{n \ge m} \max_{1 \le i_1 < i_2 < \cdots < i_m \le n} |w_{n,i_1,i_2,\ldots,i_m}| < \infty$, where $\theta = Eh(X_1, X_2, \ldots, X_m)$. The proof of this result is based on a new general result on complete convergence, which is a fundamental tool, for arrays of real-valued random variables under some mild conditions.
Abstract: Quantitative description of geochemical patterns and the production of geochemical anomaly maps are important in applied geochemistry. Several statistical methodologies have been presented to identify and separate geochemical anomalies. The U-statistic method is one of the most important structural methods; it is a kind of weighted mean in which the points surrounding each sample are considered when determining the U value. However, it can separate different anomalies based on only one variable. The main aim of the present study is to develop this method in a multivariate mode. For this purpose, the U-statistic method should be combined with a multivariate method that assigns a new value to each sample based on several variables. Therefore, in the first step, the optimum p is calculated for the p-norm distance, and then the U-statistic method is applied to the p-norm distance values of the samples, because the p-norm distance is calculated from several variables. This combination of the efficient U-statistic method and the p-norm distance is used for the first time in this research. Results show that, for Au and As, the p-norm distance with p = 2 (Euclidean distance) can be considered the optimal p-norm distance, with the lowest error. The samples indicated as anomalous by the combination of these methods are more regular, less dispersed and more accurate than those obtained using just the U-statistic or other nonstructural methods such as the Mahalanobis distance. It was also observed that the combined results are closely associated with the defined Au ore indication within the studied area. Finally, univariate and bivariate geochemical anomaly maps are provided for Au and As, prepared using the U-statistic method and its combination with the Euclidean distance method, respectively.
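The p-norm distance step can be sketched as below. This is only an illustration of the distance itself, under assumptions: the abstract does not say which reference point the distances are measured from or how the optimum p is selected, so the function name, the reference vector, and the sample values are all hypothetical.

```python
def p_norm_distance(u, v, p):
    """Minkowski (p-norm) distance between two equal-length vectors, for p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1 for a valid norm")
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


# Hypothetical two-variable sample (e.g. standardized Au and As values)
# measured against an assumed reference point at the origin.
sample = [3.0, 4.0]
reference = [0.0, 0.0]
euclidean = p_norm_distance(sample, reference, 2)   # p = 2: Euclidean distance
manhattan = p_norm_distance(sample, reference, 1)   # p = 1: sum of absolute differences
```

Each multivariate sample is thereby reduced to a single value, which is what allows a univariate method to be applied afterwards.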
Abstract: There are a few statistics for testing the homogeneity of odds ratios (OR) across strata. Asymptotic statistics lose their power in the "sparse-data" setting. Both asymptotic statistics and exact tests have low power when the sample sizes are small. We created a set of U-statistics and compared them with some existing statistics for testing homogeneity of OR in different data settings. We evaluated their performance in terms of empirical size and power via Monte Carlo simulations. Our results showed that two of the U-statistics under study had higher power for testing homogeneity of odds ratios for 2 by 2 contingency tables. The application of the tests was illustrated in two real examples.
Abstract: The past two decades have witnessed the active development of a rich probability theory of Studentized statistics or self-normalized processes, typified by Student's t-statistic as introduced by W. S. Gosset more than a century ago, and their applications to statistical problems in high dimensions, including feature selection and ranking, large-scale multiple testing, and sparse, high-dimensional signal detection. Many of these applications rely on the robustness property of Studentization/self-normalization against heavy-tailed sampling distributions. This paper gives an overview of the salient progress of self-normalized limit theory, from Student's t-statistic to more general Studentized nonlinear statistics. Prototypical examples include Studentized one- and two-sample U-statistics. Furthermore, we go beyond independence and glimpse some very recent advances in self-normalized moderate deviations under dependence.
Abstract: This paper presents a new class of test procedures for the two-sample location problem based on subsample quantiles. The class includes the Mann-Whitney test as a special case. The asymptotic normality of the proposed class of tests is established. The asymptotic relative performance of the proposed class of tests with respect to the optimal member of Xie and Priebe (2000) is studied in terms of Pitman efficiency for various underlying distributions.
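The Mann-Whitney special case can be written as a two-sample U-statistic with an indicator kernel. The sketch below shows only that special case, not the subsample-quantile class proposed in the paper; the function name and the convention of counting ties as 1/2 are assumptions made for illustration.

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """Normalized Mann-Whitney statistic: the fraction of pairs (x, y) with x < y.

    This is a two-sample U-statistic with kernel h(x, y) = 1 if x < y,
    1/2 if x == y (assumed tie convention), and 0 otherwise.
    """
    total = 0.0
    for x, y in product(xs, ys):
        if x < y:
            total += 1.0
        elif x == y:
            total += 0.5
    return total / (len(xs) * len(ys))


print(mann_whitney_u([1, 3], [2, 4]))  # 0.75: three of the four pairs satisfy x < y
```

Values near 1/2 indicate no location shift between the samples, while values near 0 or 1 indicate that one sample tends to lie below the other.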
Abstract: In this paper, the problem of testing whether one life distribution possesses the 'more IFR' property than another is considered. A new test procedure is proposed and the distribution of the test statistic is studied. The performance of the procedure is evaluated in terms of Pitman asymptotic relative efficiency. The consistency of the test procedure is established. It is observed that the new procedure is better than the existing procedure in the literature.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11671268 and 12271370), the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2020A1515010821), the Fundamental Research Funds for the Central Universities (Grant No. 12619624), and the Research Start-up Fund for New Young Teachers of Capital University of Economics and Business (Grant No. 00592254417068).
Abstract: We propose a two-sample test for the mean functions of functional data when the number of bases is much larger than the sample size. The novel test is based on U-statistics, which avoids having to estimate the covariance operator accurately in the high-dimensional situation. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. An extensive simulation study shows that the proposed test works well in comparison with several other methods in the high-dimensional situation. An application to a data set of egg-laying trajectories of Mediterranean fruit flies demonstrates the applicability of the method.
Funding: Supported in part by Hong Kong UST (Grant No. DAG05/06.SC), Hong Kong RGC CERG (Grant No. 602206), the National Natural Science Foundation (Grant No. 10801118), and the PhD Programs Foundation of the Ministry of Education of China (Grant No. 200803351094).
Abstract: Let $X_1, \ldots, X_n$ be independent and identically distributed random variables and let $W_n = W_n(X_1, \ldots, X_n)$ be an estimator of a parameter $\theta$. Denote $T_n = (W_n - \theta_0)/s_n$, where $s_n^2$ is a variance estimator of $W_n$. In this paper, a general result on the limiting distributions of the non-central studentized statistic $T_n$ is given. In particular, when $s_n^2$ is the jackknife estimate of variance, it is shown that the limit could be normal, a weighted $\chi^2$ distribution, a stable distribution, or a mixture of normal and stable distributions. Applications to the power of the studentized U- and L-tests are also discussed.
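A studentized statistic of the form $T_n = (W_n - \theta_0)/s_n$ with a jackknife variance estimate can be sketched as follows. This is an illustrative sketch under assumptions: it uses the standard leave-one-out jackknife variance formula, takes $W_n$ to be a degree-2 U-statistic, and all function names are hypothetical rather than taken from the paper.

```python
from itertools import combinations
from math import sqrt


def u_statistic(sample, kernel):
    """One-sample U-statistic of degree 2: average of kernel(a, b) over all unordered pairs."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)


def jackknife_variance(sample, estimator):
    """Standard jackknife variance estimate of estimator(sample); sample must be a list."""
    n = len(sample)
    # Recompute the estimator with each observation left out in turn.
    loo = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return (n - 1) / n * sum((v - mean_loo) ** 2 for v in loo)


def studentized(sample, estimator, theta0):
    """T_n = (W_n - theta0) / s_n, with s_n^2 the jackknife variance estimate."""
    s2 = jackknife_variance(sample, estimator)
    return (estimator(sample) - theta0) / sqrt(s2)


# Example: W_n is the sample variance, written as a U-statistic with kernel (a - b)^2 / 2.
data = [1.0, 2.0, 3.0, 4.0, 6.0]
variance_u = lambda s: u_statistic(s, lambda a, b: (a - b) ** 2 / 2)
t_n = studentized(data, variance_u, theta0=1.0)
```

For the sample mean as estimator, this jackknife variance coincides with the usual unbiased sample variance divided by n, which gives a quick sanity check.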
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10471136, 10671189) and the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KJCX3-SYW-S02).
Abstract: Single index models are widely used in medicine, econometrics and some other fields. In this paper, we consider inference for a change point problem in single index models. Based on the density-weighted average derivative estimation (ADE) method, we propose a statistic to test whether a change point exists. The null distribution of the test statistic is obtained using a permutation technique. The permuted statistic is rigorously shown to have the same limiting distribution under both the null and alternative hypotheses. After the null hypothesis of no change point is rejected, an ADE-based estimate of the change point is proposed under the assumption that the change point is unique. A simulation study confirms the theoretical results.
Funding: Supported by the Shenzhen Sci-Tech Fund (Grant No. JCYJ 20170307110329106), the Natural Science Foundation of Guangdong Province of China (Grant No. 2016A030313856), the National Natural Science Foundation of China (Grant Nos. 11701034, 11601227, 11871263 and 11671042), and the University Grants Council of Hong Kong.
Abstract: In this study, we propose a nonparametric test for heteroscedasticity in nonlinear regression models based on pairwise distances between points in a sample. The test statistic can be formulated such that U-statistic theory can be applied to it. Although the limiting null distribution of the statistic is complicated, we can derive a computationally feasible bootstrap approximation for this distribution; the validity of the introduced bootstrap algorithm is proven. The test can detect any local alternatives that differ from the null at a nearly optimal rate in hypothesis testing. The convergence rate of this test statistic does not depend on the dimension of the covariates, which significantly alleviates the impact of dimensionality. We provide three simulation studies and a real-data example to evaluate the performance of the test and demonstrate its applications.
Abstract: Several tests for the multivariate mean vector have been proposed in the recent literature. Generally, these tests are directly concerned with the mean vector of a high-dimensional distribution. This paper presents two new test procedures for testing the mean vector in large dimension and small samples. We do not focus on the mean vector directly, which is a different framework from the existing choices. The first test procedure is based on the asymptotic distribution of the test statistic, where the dimension increases with the sample size. The second test procedure is based on the permutation distribution of the test statistic, where the sample size is fixed and the dimension grows to infinity. Simulations are carried out to examine the finite-sample performance of the tests and to compare them with some popular nonparametric tests available in the literature.
Funding: Research supported by the National Natural Science Foundation of China and a CRCG grant of the University of Hong Kong.
Abstract: Let $F_n$ be the Kaplan-Meier estimator of a distribution function $F$. Let $J(\cdot)$ be a measurable real-valued function. In this paper, a U-statistic representation for the Kaplan-Meier L-estimator, $T(F_n) = \int x J(F_n(x))\, dF_n(x)$, is derived. Furthermore, the representation is used to establish a Berry-Esseen inequality for $T(F_n)$.