Abstract: This paper considers a semi-parametric regression model with an adaptive LASSO penalty imposed on both the linear and the nonlinear components of the model. The model is rewritten so that a signed-rank technique can be used for estimation. The nonlinear part consists of a covariate that enters the model through an unknown function, which is estimated using B-splines. The author shows that the resulting estimator is consistent under heavy-tailed distributions and gives asymptotic normality results. Monte Carlo simulations and practical applications are used to assess the validity of the proposed estimation method.
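To make the signed-rank idea concrete, the following is a minimal sketch of a Wilcoxon-type signed-rank dispersion for a one-covariate linear model. It is an illustration under stated assumptions, not the author's estimator: it omits the adaptive LASSO penalty and the B-spline component, and the function names (`average_ranks`, `signed_rank_dispersion`, `objective`) and the toy data are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`, averaging ranks over ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def signed_rank_dispersion(residuals):
    """Sum of |e_i| weighted by the scaled rank of |e_i| (Wilcoxon scores)."""
    n = len(residuals)
    abs_res = [abs(e) for e in residuals]
    ranks = average_ranks(abs_res)
    return sum(r / (n + 1) * a for r, a in zip(ranks, abs_res))


def objective(slope, xs, ys):
    """Unpenalized signed-rank objective for the model y = slope * x + error."""
    residuals = [y - slope * x for x, y in zip(xs, ys)]
    return signed_rank_dispersion(residuals)


# Hypothetical data: the last response is a gross outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 40.0]
grid = [k / 10 for k in range(0, 51)]  # candidate slopes 0.0, 0.1, ..., 5.0
best_slope = min(grid, key=lambda b: objective(b, xs, ys))
print(best_slope)
```

Because the objective grows linearly rather than quadratically in the absolute residuals, a single extreme response has less influence than it would under least squares, which is the motivation for rank-based fitting under heavy tails.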
Funding: Supported by the National Basic Research Program of China (973 Program) (Grant No. 2018AAA0100704), the National Natural Science Foundation of China (Grant Nos. 11825104 and 11690013), the Youth Talent Support Program, and the Australian Research Council; supported by the National Natural Science Foundation of China (Grant No. 12001109), the Shanghai Sailing Program (Grant No. 19YF1402800), and the Science and Technology Commission of Shanghai Municipality (Grant No. 20dz1200600).
Abstract: This paper studies the reduced rank regression problem, which assumes a low-rank structure of the coefficient matrix, together with heavy-tailed noise. To address the heavy-tailed noise, we adopt the quantile loss function instead of the commonly used squared loss. However, the non-smooth quantile loss brings new challenges to both the computation and the development of statistical properties, especially when the data are large and distributed across different machines. To this end, we first transform the response variable and reformulate the problem as a trace-norm regularized least-squares problem, which greatly facilitates the computation. Based on this formulation, we further develop a distributed algorithm. Theoretically, we establish the convergence rate of the obtained estimator and a theoretical guarantee for rank recovery. A simulation analysis is provided to demonstrate the effectiveness of our method.
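For readers unfamiliar with the quantile loss mentioned above, here is a minimal sketch of the standard check function, rho_tau(u) = u * (tau - 1{u < 0}). It covers only the loss itself; the response transformation, the trace-norm penalty, and the distributed algorithm are not reproduced. The function names are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (check) loss: residual * (tau - 1{residual < 0})."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def mean_check_loss(residuals, tau):
    """Average check loss over a collection of residuals."""
    return sum(check_loss(r, tau) for r in residuals) / len(residuals)
```

The loss is piecewise linear with a kink at zero, so it is not differentiable there. That kink is the non-smoothness the abstract identifies as a computational obstacle, and the linear growth in the tails is what makes the loss less sensitive to heavy-tailed noise than the squared loss.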
Funding: The National Key Research and Development Program of China (2018YFC2000500), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB29020000), the National Natural Science Foundation of China (31771481 and 91857101), and the Key Research Program of the Chinese Academy of Sciences (KFZD-SW-219), "China Microbiome Initiative."
Abstract: Recent technological advancements and developments have led to a dramatic increase in the amount of high-dimensional data and thus have increased the demand for proper and efficient multivariate regression methods. Numerous traditional multivariate approaches, such as principal component analysis, have been used broadly in various research areas, including investment analysis, image identification, and population genetic structure analysis. However, these common approaches ignore the correlations between responses and have low variable selection efficiency. Therefore, in this article, we introduce the reduced rank regression method and its extensions, sparse reduced rank regression and subspace assisted regression with row sparsity, which hold potential to meet the above demands and thus improve the interpretability of regression models. We conducted a simulation study to evaluate their performance and compared them with several other variable selection methods. For different application scenarios, we also provide selection suggestions based on predictive ability and variable selection accuracy. Finally, to demonstrate the practical value of these methods in the field of microbiome research, we applied our chosen method to real population-level microbiome data, the results of which validated our method. Our method extensions provide valuable guidelines for future omics research, especially with respect to multivariate regression, and could pave the way for novel discoveries in microbiome and related research fields.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11771032, 11571340 and 11701020, the Science and Technology Project of Beijing Municipal Education Commission under Grant Nos. KM201710005032 and KM201910005015, and the International Research Cooperation Seed Fund of Beijing University of Technology under Grant No. 006000514118553.
Abstract: This paper investigates hypothesis testing for the parametric component in partial functional linear regression models. Based on a rank score function, the authors develop a rank test using functional principal component analysis, and establish the asymptotic properties of the resulting test under the null and local alternative hypotheses. A simulation study shows that the proposed test procedure has good size and power in finite samples. The authors also illustrate the method by fitting the Berkeley Growth Data and testing the effect of gender on children's height.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12071483).
Abstract: Owing to its direct statistical interpretation, censored linear regression offers a valuable complement to Cox proportional hazards regression in survival analysis. We propose a rank-based high-dimensional inference procedure for censored linear regression without imposing any moment condition on the model error. We develop a theory of the high-dimensional U-statistic, circumvent challenges stemming from the non-smoothness of the loss function, and establish the convergence rate of the regularized estimator, the asymptotic normality of the resulting de-biased estimator, and the consistency of the asymptotic variance estimator. As censoring can be viewed as a way of trimming, it strengthens the robustness of the rank-based high-dimensional inference, particularly under heavy-tailed model errors or outliers in the response. We evaluate the finite-sample performance of the proposed method via extensive simulation studies and demonstrate its utility by applying it to a subcohort study from The Cancer Genome Atlas (TCGA).
Funding: Support provided by the National Alzheimer's Coordinating Center (NACC); supported by National University of Singapore Academic Research Funding (Grant No. R-155-000-109-112), a CBRG grant from the National Medical Research Council in Singapore, NACC (Grant No. U01AG16976), the National Institutes of Health (Grant No. R01EB005829), and the National Natural Science Foundation of China (Grant No. 30728019).
Abstract: We consider the estimation of three-dimensional ROC surfaces for continuous tests given covariates. Three-way ROC analysis is important in our motivating example, where patients with Alzheimer's disease are usually classified into three categories and should receive different category-specific medical treatment. There has been no discussion of how covariates affect three-way ROC analysis. We propose a regression framework induced from the relationship between test results and covariates. We consider several practical cases and the corresponding inference procedures. Simulations are conducted to validate our methodology. The application to the motivating example clearly illustrates the effects of age and sex on the accuracy of the Mini-Mental State Examination for Alzheimer's disease.
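As a rough illustration of what a point on a three-way ROC surface is, the sketch below computes the three empirical correct-classification rates for a pair of thresholds. It assumes the three groups are ordered so that higher test results indicate a higher category, and it ignores covariates entirely, so it does not implement the regression framework proposed in the paper. The function name, threshold names, and example values are hypothetical.

```python
def true_class_fractions(scores_1, scores_2, scores_3, c1, c2):
    """Empirical correct-classification rates for thresholds c1 <= c2.

    Decision rule (an assumption of this sketch):
      result <= c1       -> category 1
      c1 < result <= c2  -> category 2
      result > c2        -> category 3
    """
    if c1 > c2:
        raise ValueError("thresholds must satisfy c1 <= c2")
    tcf1 = sum(x <= c1 for x in scores_1) / len(scores_1)
    tcf2 = sum(c1 < x <= c2 for x in scores_2) / len(scores_2)
    tcf3 = sum(x > c2 for x in scores_3) / len(scores_3)
    return tcf1, tcf2, tcf3


# Hypothetical test results for the three categories.
point = true_class_fractions(
    [1, 2, 3, 4], [3, 5, 6, 8], [7, 9, 10, 11], c1=3.5, c2=7.5
)
print(point)
```

Sweeping `c1` and `c2` over all ordered pairs traces out the empirical surface. A covariate-adjusted analysis would let the score distributions, and hence these rates, depend on variables such as age and sex.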
Funding: Supported by the National Science Foundation of USA (Grant No. DMS1265202) and the National Institutes of Health of USA (Grant No. 1-U54AI117924-01).
Abstract: The objective of this paper is to quantify the complexity of rank-constrained and nuclear-norm-constrained methods for low-rank matrix estimation problems. Specifically, we derive analytic forms of the degrees of freedom for these types of estimators in several common settings. These results provide efficient ways of comparing different estimators and choosing tuning parameters. Moreover, our analyses reveal new insights into the behavior of these low-rank matrix estimators. These observations are of great theoretical and practical importance. In particular, they suggest that, contrary to conventional wisdom, for rank-constrained estimators the total number of free parameters underestimates the degrees of freedom, whereas for nuclear norm penalization, it overestimates the degrees of freedom. In addition, when using most model selection criteria to choose the tuning parameter for nuclear norm penalization, it often suffices to entertain a finite number of candidates as opposed to a continuum of choices. Numerical examples are also presented to illustrate the practical implications of our results.
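The "total number of free parameters" referred to above is usually taken to be r(m + n - r) for an m-by-n matrix of rank r. The sketch below computes only this naive count, which the abstract says misstates the degrees of freedom; the paper's analytic degrees-of-freedom formulas are not reproduced here, and the function name is hypothetical.

```python
def free_parameters(n_rows, n_cols, rank):
    """Naive parameter count r * (m + n - r) for an m-by-n matrix of rank r."""
    if not 0 <= rank <= min(n_rows, n_cols):
        raise ValueError("rank must lie between 0 and min(n_rows, n_cols)")
    return rank * (n_rows + n_cols - rank)
```

At full rank the count reduces to the number of matrix entries. For example, a 10-by-5 matrix of rank 5 gives 5 * (10 + 5 - 5) = 50.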