Funding: Supported by the National Basic Research Program of China (973 Program) (Grant No. 2018AAA0100704), the National Natural Science Foundation of China (Grant Nos. 11825104 and 11690013), the Youth Talent Support Program, and the Australian Research Council; supported by the National Natural Science Foundation of China (Grant No. 12001109), the Shanghai Sailing Program (Grant No. 19YF1402800), and the Science and Technology Commission of Shanghai Municipality (Grant No. 20dz1200600).
Abstract: This paper studies the reduced rank regression problem, which assumes a low-rank structure of the coefficient matrix, together with heavy-tailed noise. To address the heavy-tailed noise, we adopt the quantile loss function instead of the commonly used squared loss. However, the non-smooth quantile loss brings new challenges to both the computation and the development of statistical properties, especially when the data are large in size and distributed across different machines. To this end, we first transform the response variable and reformulate the problem as a trace-norm regularized least-squares problem, which greatly facilitates the computation. Based on this formulation, we further develop a distributed algorithm. Theoretically, we establish the convergence rate of the obtained estimator and a theoretical guarantee for rank recovery. A simulation analysis is provided to demonstrate the effectiveness of our method.
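The two building blocks named in this abstract, the quantile (check) loss and a trace-norm regularized least-squares fit, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not specify the response transformation or the distributed scheme, so neither is shown, and the function names `check_loss`, `svt`, and `trace_norm_ls` are hypothetical. The solver is a generic single-machine proximal gradient iteration.

```python
import numpy as np


def check_loss(residual, tau=0.5):
    """Average quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    return float(np.mean(residual * (tau - (residual < 0))))


def svt(matrix, threshold):
    """Singular value soft-thresholding, the proximal operator of
    threshold * (trace norm)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    return (u * s_shrunk) @ vt


def trace_norm_ls(x, y, lam, n_iter=500):
    """Proximal gradient for (1 / (2n)) * ||Y - X B||_F^2 + lam * ||B||_*.

    Generic solver chosen for illustration; the paper's own algorithm
    is not described in the abstract.
    """
    n, p = x.shape
    # Step size 1 / L, where L = ||X||_2^2 / n bounds the gradient's
    # Lipschitz constant.
    step = n / (np.linalg.norm(x, 2) ** 2)
    b = np.zeros((p, y.shape[1]))
    for _ in range(n_iter):
        grad = x.T @ (x @ b - y) / n
        b = svt(b - step * grad, step * lam)
    return b
```

A larger `lam` shrinks more singular values of the coefficient matrix to zero, which is how the trace-norm penalty encourages a low-rank estimate.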
Funding: The National Key Research and Development Program of China (2018YFC2000500), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB29020000), the National Natural Science Foundation of China (31771481 and 91857101), and the Key Research Program of the Chinese Academy of Sciences (KFZD-SW-219), "China Microbiome Initiative."
Abstract: Recent technological advancements have led to a dramatic increase in the amount of high-dimensional data and thus have increased the demand for proper and efficient multivariate regression methods. Numerous traditional multivariate approaches, such as principal component analysis, have been used broadly in various research areas, including investment analysis, image identification, and population genetic structure analysis. However, these common approaches are limited in that they ignore the correlations between responses and have low variable selection efficiency. Therefore, in this article, we introduce the reduced rank regression method and its extensions, sparse reduced rank regression and subspace assisted regression with row sparsity, which hold potential to meet the above demands and thus improve the interpretability of regression models. We conducted a simulation study to evaluate their performance and compared them with several other variable selection methods. For different application scenarios, we also provide selection suggestions based on predictive ability and variable selection accuracy. Finally, to demonstrate the practical value of these methods in the field of microbiome research, we applied our chosen method to real population-level microbiome data, the results of which validated our method. Our method extensions provide valuable guidelines for future omics research, especially with respect to multivariate regression, and could pave the way for novel discoveries in microbiome and related research fields.
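As background for the method this abstract introduces, classical reduced rank regression with an identity weight matrix has a closed form: fit ordinary least squares, then project the coefficients onto the leading right singular vectors of the fitted values. The sketch below shows only that classical estimator. The sparse and row-sparse extensions mentioned in the abstract are not covered, and the function name `reduced_rank_regression` is hypothetical.

```python
import numpy as np


def reduced_rank_regression(x, y, rank):
    """Classical rank-constrained least squares (identity weighting).

    Minimizes ||Y - X B||_F^2 subject to rank(B) <= rank.
    """
    # Unconstrained least-squares coefficients.
    b_ols, *_ = np.linalg.lstsq(x, y, rcond=None)
    # Leading right singular vectors of the fitted values X @ B_ols.
    _, _, vt = np.linalg.svd(x @ b_ols, full_matrices=False)
    v_r = vt[:rank].T
    # Project the OLS coefficients onto the rank-r response subspace.
    return b_ols @ v_r @ v_r.T
```

With `rank` equal to the number of responses, the projection is the identity and the estimator reduces to ordinary least squares.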
Funding: Supported by the National Science Foundation of USA (Grant No. DMS1265202) and the National Institutes of Health of USA (Grant No. 1-U54AI117924-01).
Abstract: The objective of this paper is to quantify the complexity of rank and nuclear norm constrained methods for low rank matrix estimation problems. Specifically, we derive analytic forms of the degrees of freedom for these types of estimators in several common settings. These results provide efficient ways of comparing different estimators and eliciting tuning parameters. Moreover, our analyses reveal new insights on the behavior of these low rank matrix estimators. These observations are of great theoretical and practical importance. In particular, they suggest that, contrary to conventional wisdom, for rank constrained estimators the total number of free parameters underestimates the degrees of freedom, whereas for nuclear norm penalization, it overestimates the degrees of freedom. In addition, when using most model selection criteria to choose the tuning parameter for nuclear norm penalization, it oftentimes suffices to entertain a finite number of candidates as opposed to a continuum of choices. Numerical examples are also presented to illustrate the practical implications of our results.
Abstract: This paper studies a partially nonstationary vector autoregressive (VAR) model with vector GARCH noises. We study the full rank and the reduced rank quasi-maximum likelihood estimators (QMLEs) of the parameters in the model. It is shown that both QMLEs of the long-run parameters asymptotically converge to a functional of two correlated vector Brownian motions. Based on these results, the likelihood ratio (LR) test statistic for the cointegration rank is shown to be, asymptotically, a functional of a standard Brownian motion and a normal vector. As far as we know, our test is new in the literature. The critical values of the LR test are simulated via the Monte Carlo method. The performance of this test in finite samples is examined through Monte Carlo experiments. We apply our approach to an empirical example of three interest rates.