Funding: National Natural Science Foundation of China (Grant No. 11671059).
Abstract: Estimating high-dimensional Gaussian graphical models has attracted much attention in recent years. Most existing methods are one-step approaches, either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. The first step is a screening step, in which many entries of the concentration matrix are identified as zeros and removed from further consideration. The second step focuses on the remaining entries and performs selection and estimation of the nonzero entries of the concentration matrix. Because the screening step effectively reduces the dimension of the parameter space, the accuracy of the estimated concentration matrix can potentially be improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.
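The two-step idea can be illustrated with a small sketch. The abstract does not specify the screening rule or the second-step estimator, so everything concrete here is an assumption made for illustration: screening thresholds the absolute sample correlations, and the second step refits node-wise least squares on the surviving candidates and then hard-thresholds the resulting concentration entries. The function name `two_step_ggm` and the thresholds `screen_tau` and `select_tau` are hypothetical, and the demo is a low-dimensional toy (n much larger than p), not the high-dimensional setting of the paper.

```python
import numpy as np

def two_step_ggm(X, screen_tau, select_tau):
    """Toy two-step estimate of a concentration matrix (illustrative only)."""
    n, p = X.shape
    X = X - X.mean(axis=0)

    # Step 1 (screening, assumed rule): pairs with a small absolute sample
    # correlation are fixed at zero and dropped from further consideration.
    R = np.corrcoef(X, rowvar=False)
    candidates = np.abs(R) > screen_tau
    np.fill_diagonal(candidates, False)

    # Step 2 (assumed rule): regress each variable on its surviving
    # candidates only, then map coefficients to concentration entries via
    # omega_jj = 1 / residual variance, omega_jk = -beta_jk * omega_jj.
    omega = np.zeros((p, p))
    for j in range(p):
        nbrs = np.flatnonzero(candidates[j])
        if nbrs.size:
            coef, *_ = np.linalg.lstsq(X[:, nbrs], X[:, j], rcond=None)
            resid = X[:, j] - X[:, nbrs] @ coef
        else:
            coef, resid = np.zeros(0), X[:, j]
        omega[j, j] = 1.0 / resid.var()
        omega[j, nbrs] = -coef * omega[j, j]

    # Symmetrise the two node-wise estimates, then select nonzero entries
    # by hard-thresholding the off-diagonal values.
    omega = (omega + omega.T) / 2.0
    keep = np.abs(omega) > select_tau
    np.fill_diagonal(keep, True)
    return np.where(keep, omega, 0.0), candidates

# Demo on a chain graph: the true concentration matrix is tridiagonal.
rng = np.random.default_rng(0)
p, n = 10, 2000
omega_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega_true), size=n)

omega_hat, candidates = two_step_ggm(X, screen_tau=0.15, select_tau=0.2)
n_candidates = int(candidates.sum()) // 2
edges = {(j, k) for j in range(p) for k in range(j + 1, p) if omega_hat[j, k] != 0}
print("pairs kept by screening:", n_candidates, "of", p * (p - 1) // 2)
print("selected edges:", sorted(edges))
```

The second step only ever fits parameters for pairs that survive screening, which is the dimension reduction the abstract refers to. The sketch relies on the screened candidate set containing every true neighbour; if screening drops a true edge, the refit is biased.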
Funding: Supported by the National Natural Science Foundation of China (No. 11671059).
Abstract: Large-scale empirical data, in which both the sample size and the dimension are high, often exhibit various characteristics. For example, the noise term may follow an unknown distribution, or the model may be so sparse that the number of critical variables stays fixed while the dimensionality grows with n. The authors consider the model selection problem of the lasso for this kind of data. They investigate both theoretical guarantees and simulations, and show that the lasso is robust for various kinds of data.
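A small simulation in the spirit of this setting can be sketched as follows. It is not the authors' experiment: the sample size, dimension, coefficients, Laplace noise, and penalty level `lam` are all assumed values chosen for illustration, and `lasso_cd` is a hypothetical helper that solves the lasso by plain cyclic coordinate descent. The design has a fixed number of critical variables (three) and non-Gaussian noise, and the check is whether the lasso selects exactly those variables.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=50):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    for _ in range(n_sweeps):
        for j in range(p):
            c = cols[j]
            # Correlation of column j with the partial residual
            # (the residual with column j's own contribution added back).
            rho = sum(c[i] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= delta * c[i]
                beta[j] = new
    return beta

# Assumed toy design: 3 critical variables out of p = 40, Laplace noise.
rng = random.Random(0)
n, p = 200, 40
true_beta = [3.0, -2.0, 2.5] + [0.0] * (p - 3)
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

def laplace(scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

y = [sum(X[i][j] * true_beta[j] for j in range(3)) + laplace(0.35) for i in range(n)]

beta_hat = lasso_cd(X, y, lam=0.5)
support = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("selected variables:", support)
```

The signal here is deliberately strong relative to the noise, so exact recovery of the three critical variables is the expected outcome. The selected coefficients are shrunk toward zero by roughly `lam`, which is the usual lasso bias and not an error in the sketch.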