Abstract: Covariance matrix estimation is important in many fields, such as statistics. In real applications, data are frequently affected by high dimensionality and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices from missing and noisy samples under the norm. First, a model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. Next, the minimax lower bound is derived, and it is concluded that the proposed estimator is rate-optimal. Finally, numerical simulations are performed. The results show that, for missing samples with sub-Gaussian noise, the hard thresholding estimator outperforms the traditional estimation method when the true covariance matrix is sparse.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61374027, 11871357) and the Sichuan Science and Technology Program (No. 2019YJ0122).
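As a rough illustration of the idea in the abstract above, the sketch below computes a pairwise-complete ("generalized") sample covariance from data with missing entries and then hard-thresholds it. It is a minimal sketch under stated assumptions, not the paper's estimator: the noise correction is omitted, the threshold form `delta * sqrt(log p / n_jk)`, the default `delta=2.0`, the choice to leave the diagonal unthresholded, and all function names are assumptions made for illustration.

```python
import math

def generalized_sample_cov(X):
    """Pairwise-complete covariance. X is a list of rows; None marks a missing entry.

    Assumes every variable is observed at least once.
    Returns (cov, counts), where counts[j][k] is the number of rows
    in which variables j and k are both observed.
    """
    p = len(X[0])
    # Per-variable means over the observed entries only.
    means = []
    for j in range(p):
        observed = [row[j] for row in X if row[j] is not None]
        means.append(sum(observed) / len(observed))
    cov = [[0.0] * p for _ in range(p)]
    counts = [[0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            pairs = [(row[j], row[k]) for row in X
                     if row[j] is not None and row[k] is not None]
            counts[j][k] = counts[k][j] = len(pairs)
            if pairs:
                s = sum((a - means[j]) * (b - means[k]) for a, b in pairs)
                cov[j][k] = cov[k][j] = s / len(pairs)
    return cov, counts

def hard_threshold(cov, counts, delta=2.0):
    """Zero out off-diagonal entries whose magnitude is below the threshold.

    Assumed threshold: delta * sqrt(log(p) / n_jk), with n_jk = counts[j][k].
    Diagonal entries are kept as they are (an illustrative choice).
    """
    p = len(cov)
    out = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            if counts[j][k] == 0:
                continue  # no jointly observed rows: leave the entry at zero
            lam = delta * math.sqrt(math.log(p) / counts[j][k])
            if j == k or abs(cov[j][k]) >= lam:
                out[j][k] = cov[j][k]
    return out

# Toy data with missing entries (hypothetical values).
X = [[1.0, 2.0, None],
     [2.0, None, 1.0],
     [3.0, 6.0, 1.0],
     [4.0, 8.0, None]]
cov, counts = generalized_sample_cov(X)
sparse_cov = hard_threshold(cov, counts)
```

Because the threshold depends on the pair count `n_jk`, entries estimated from fewer jointly observed rows must be larger to survive, which is one plausible way to account for uneven missingness.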
Abstract: This paper addresses the problem of testing sphericity and identity of a high-dimensional population covariance matrix when the data dimension exceeds the sample size. A central limit theorem for the first four moments of the eigenvalues of the sample covariance matrix is derived using random matrix theory for generally distributed populations. Further, some desirable asymptotic properties of the proposed test statistics are provided under the null hypothesis as the data dimension and sample size both tend to infinity. Simulations show that the proposed tests have greater power than existing methods under the spiked covariance model.
Funding: Supported by the Simons Foundation, the National Natural Science Foundation of China (Grant Nos. 11771390 and 11371318), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LR16A010001), the University of Sydney and Zhejiang University Partnership Collaboration Awards, and the Fundamental Research Funds for the Central Universities.
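The abstract above builds its tests on the first four moments of the eigenvalues of the sample covariance matrix. The sketch below shows one way such moments can be computed without an eigendecomposition, using the identity that the k-th eigenvalue moment equals tr(S^k)/p. It also evaluates John's classical sphericity statistic, U = m2 / m1^2 - 1, purely as a familiar example of a moment-based statistic; the paper's own test statistics are not given in the abstract and are not reproduced here. Function names are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product for square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def spectral_moments(S, max_order=4):
    """Return [tr(S)/p, tr(S^2)/p, ..., tr(S^max_order)/p].

    For a symmetric S these are the first moments of its eigenvalues.
    """
    p = len(S)
    power = [row[:] for row in S]  # holds S^k, starting at k = 1
    moments = []
    for _ in range(max_order):
        moments.append(sum(power[i][i] for i in range(p)) / p)
        power = matmul(power, S)
    return moments

def john_u(S):
    """John's sphericity statistic (1/p) tr[(S / m1 - I)^2] = m2 / m1^2 - 1."""
    m1, m2 = spectral_moments(S, max_order=2)
    return m2 / (m1 * m1) - 1.0

# A spherical matrix gives U = 0; unequal eigenvalues give U > 0.
spherical = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
unequal = [[1.0, 0.0], [0.0, 3.0]]
u_spherical = john_u(spherical)
u_unequal = john_u(unequal)
```

Working with traces of matrix powers avoids computing eigenvalues explicitly, at the cost of repeated matrix products.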
Abstract: Testing the equality of covariance matrices is important in many areas of statistical analysis, such as microarray analysis and quality control. Conventional tests for finite-dimensional covariance matrices do not apply to high-dimensional data in general, and the tests for high-dimensional covariance matrices in the literature usually depend on some special structure of the matrix and on whether the dimension diverges. In this paper, we propose a jackknife empirical likelihood method to test the equality of covariance matrices. The asymptotic distribution of the new test does not depend on whether the dimension is divergent or fixed. Simulation studies show that the new test has a very stable size with respect to the dimension and is more powerful than the test proposed by Schott (2007) and studied by Srivastava and Yanagihara (2010). Furthermore, we illustrate the method using a breast cancer dataset.
Funding: Supported by the National Natural Science Foundation of China (11971433), First Class Discipline of Zhejiang-A (Zhejiang Gongshang University-Statistics), and Hunan Soft Science Research Project (2012ZK3064).
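The jackknife empirical likelihood approach named in the abstract above combines two generic steps: form jackknife pseudo-values of a statistic, then apply empirical likelihood for a mean to those pseudo-values. The sketch below illustrates only these two generic steps for a scalar statistic. The paper's actual statistic for comparing covariance matrices is not stated in the abstract, so the sample variance used here is a stand-in, and all function names are hypothetical.

```python
import math

def jackknife_pseudo_values(data, statistic):
    """Pseudo-values V_i = n * T(all) - (n - 1) * T(all without i)."""
    n = len(data)
    full = statistic(data)
    return [n * full - (n - 1) * statistic(data[:i] + data[i + 1:])
            for i in range(n)]

def el_log_ratio(values, mu=0.0, iters=200):
    """-2 log empirical likelihood ratio for H0: E[value] = mu.

    Solves sum z_i / (1 + lam * z_i) = 0 for lam by bisection, z_i = value_i - mu.
    Returns infinity when mu lies outside the open range of the values.
    """
    z = [v - mu for v in values]
    z_min, z_max = min(z), max(z)
    if not (z_min < 0.0 < z_max):
        return math.inf
    # lam must keep every weight positive: 1 + lam * z_i > 0 for all i.
    lo, hi = -1.0 / z_max, -1.0 / z_min
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # score is strictly decreasing in lam, positive near lo and negative near hi.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)

def sample_variance(xs):
    """Unbiased sample variance, used here only as a stand-in statistic."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical data; tests H0: variance = 1.0 via the pseudo-values.
data = [0.3, -1.2, 0.8, 1.9, -0.4, 0.1, -2.0, 1.1]
pseudo = jackknife_pseudo_values(data, sample_variance)
stat = el_log_ratio(pseudo, mu=1.0)
```

In the standard setting, this statistic would be compared against a chi-squared quantile with one degree of freedom; whether that calibration carries over to the paper's high-dimensional setting is not stated in the abstract.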
Abstract: Detecting differential expression of genes is not uncommon in genomic research (e.g., on 2019-nCoV). Because of cost, only a small sample is used to estimate a large number of variances (or their inverses) simultaneously, and the commonly used approaches perform unreliably. By borrowing information across variables or using prior information about the variables, shrinkage estimation approaches have been proposed, and some optimal shrinkage estimators have been obtained in an asymptotic sense. In this paper, we focus on the small-sample setting and give a likelihood-unbiased estimator for powers of the variances under the assumption that the variances follow a chi-squared distribution. Simulations show that the likelihood-unbiased estimators for the variances and their inverses perform very well. In addition, an application comparison and real data analysis indicate that the proposed estimator also works well.