Abstract: The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a data set. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that under a mild sparsity condition, the posterior distribution of the latent feature matrix, generated via IBP or pIBP priors, converges to the true latent feature matrix asymptotically. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, thereby making the posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulation, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein arrays (RPPA) dataset under the IBP prior model.
Funding: National Science Foundation of China (NNSF) [grant number NSFC-71531004].
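For readers unfamiliar with the IBP prior mentioned in the abstract above, the following is a minimal sketch of the standard one-parameter "restaurant" construction of a binary latent feature matrix. It is only an illustration of the prior, not the paper's inference procedure or its pIBP extension; the function name `sample_ibp` and the parameter names `n_rows` and `alpha` are chosen here for illustration.

```python
import numpy as np


def sample_ibp(n_rows, alpha, rng):
    """Draw a binary feature matrix from the one-parameter IBP (illustrative)."""
    columns = []  # columns[k] lists the row indices that own feature k
    for i in range(1, n_rows + 1):
        # Existing features: row i takes feature k with probability m_k / i,
        # where m_k is the number of earlier rows that already own it.
        for owners in columns:
            if rng.random() < len(owners) / i:
                owners.append(i - 1)
        # New features: row i introduces Poisson(alpha / i) fresh columns.
        for _ in range(rng.poisson(alpha / i)):
            columns.append([i - 1])
    Z = np.zeros((n_rows, len(columns)), dtype=int)
    for k, owners in enumerate(columns):
        Z[owners, k] = 1
    return Z


rng = np.random.default_rng(0)
Z = sample_ibp(n_rows=20, alpha=2.0, rng=rng)
print(Z.shape)  # number of columns is random; it grows slowly with n_rows
```

The number of columns is not fixed in advance, which is why the abstract treats the case where the dimensionality of the latent feature matrix grows with the sample size.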
Abstract: Estimation of large covariance matrices is of great importance in multivariate analysis. The modified Cholesky decomposition is a commonly used technique in covariance matrix estimation given a specific order of variables. However, information on the order of variables is often unknown, or cannot be reasonably assumed in practice. In this work, we propose a Cholesky-based model averaging approach to covariance matrix estimation for high-dimensional data, with proper regularisation imposed on the Cholesky factor matrix. The proposed method not only guarantees the positive definiteness of the covariance matrix estimate, but is also applicable in general situations without the order of variables being pre-specified. Numerical simulations are conducted to evaluate the performance of the proposed method in comparison with several other covariance matrix estimates. The advantage of our proposed method is further illustrated by a real case study of equity portfolio allocation.
Funding: Supported by the National Natural Science Foundation of China [grant number 11831008] and the U.S. National Science Foundation [grant numbers DMS-1612873 and DMS-1914411].
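As background for the abstract above, the modified Cholesky decomposition writes a covariance matrix Sigma as T Sigma T' = D, with T unit lower triangular and D diagonal with positive entries. The sketch below computes T and D from an ordinary Cholesky factor for one fixed variable order; it does not implement the paper's model averaging or regularisation, and the function name `modified_cholesky` and the example matrix are assumptions made for illustration.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T @ sigma @ T.T == diag(d)."""
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower triangular
    d = np.diag(L) ** 2             # innovation variances (diagonal of D)
    C = L / np.diag(L)              # scale column j by 1 / L[j, j]: unit diagonal
    T = np.linalg.inv(C)            # inverse of unit lower triangular is the same type
    return T, d


# Small positive definite example matrix (values chosen for illustration only).
sigma_example = np.array([[4.0, 2.0, 1.0],
                          [2.0, 3.0, 0.5],
                          [1.0, 0.5, 2.0]])
T, d = modified_cholesky(sigma_example)

# Any unit lower triangular T combined with positive d gives back a positive
# definite matrix, which is why estimates built this way stay positive definite.
T_inv = np.linalg.inv(T)
sigma_rebuilt = T_inv @ np.diag(d) @ T_inv.T
```

Reordering the variables before the factorisation generally yields a different T and D, which is the order dependence the abstract sets out to remove.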
Abstract: With the increasing appearance of high-dimensional data over the past two decades, variable selection through frequentist likelihood penalisation approaches and their Bayesian counterparts has become a popular yet challenging research area in statistics. Under a normal linear model with shrinkage priors, we propose a benchmark variable approach for Bayesian variable selection. The benchmark variable serves as a standard and helps us to assess and rank the importance of each covariate based on the posterior distribution of the corresponding regression coefficient. For a sparse Bayesian analysis, we use the benchmark in conjunction with a modified BIC. We also extend our benchmark approach to accommodate models with covariates exhibiting group structures. Two simulation studies are carried out to assess and compare the performance of the proposed approach and other methods. Three real datasets are also analysed using these methods for illustration.
Funding: Supported by the National Science Foundation [1634867].
Abstract: The smoothing spline is a popular method in non-parametric function estimation. For the analysis of data from real applications, specific shapes of the estimated function are often required to ensure that the estimated function does not deviate from the domain knowledge. In this work, we focus on constructing the exact shape-constrained smoothing spline with efficient estimation. Here, 'exact' means that the shape constraint is imposed on an infinite set, such as an interval in the one-dimensional case. The estimation thus becomes a so-called semi-infinite optimisation problem with an infinite number of constraints. The proposed method establishes a necessary and sufficient condition for transforming the exact shape constraints into a finite number of constraints, leading to efficient estimation of the shape-constrained functions. The performance of the proposed method is evaluated by both simulation and real case studies.