Identifying genetic risk factors for Alzheimer's disease (AD) is an important research topic. To date, different endophenotypes, such as imaging-derived endophenotypes and proteomic expression-derived endophenotypes, have shown great value in uncovering risk genes compared to case-control studies. Biologically, a co-varying pattern of different omics-derived endophenotypes could result from a shared genetic basis. However, existing methods mainly focus on the effect of endophenotypes alone; the effect of cross-endophenotype (CEP) associations remains largely unexploited. In this study, we used both endophenotypes and their CEP associations from multi-omic data to identify genetic risk factors, and proposed two integrated multi-task sparse canonical correlation analysis (inMTSCCA) methods: pairwise endophenotype correlation-guided MTSCCA (pcMTSCCA) and high-order endophenotype correlation-guided MTSCCA (hocMTSCCA). pcMTSCCA employs pairwise correlations between magnetic resonance imaging (MRI)-derived, plasma-derived, and cerebrospinal fluid (CSF)-derived endophenotypes as an additional penalty. hocMTSCCA uses high-order correlations among these multi-omic data for regularization. To identify genetic risk factors at the individual and group levels, as well as altered endophenotypic markers, we introduced sparsity-inducing penalties into both models. We compared pcMTSCCA and hocMTSCCA with three related methods on both simulated and real datasets, the latter consisting of neuroimaging data, proteomic analytes, and genetic data. The results showed that our methods obtained better or comparable canonical correlation coefficients (CCCs) and better feature subsets than the benchmarks. Most importantly, the identified genetic loci and heterogeneous endophenotypic markers showed high relevance to AD. Therefore, jointly using multi-omic endophenotypes and their CEP associations is a promising way to reveal genetic risk factors.
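To make the two building blocks named in the abstract concrete, sparsity-inducing penalties and the canonical correlation coefficient (CCC), the following is a minimal sketch of a generic single-task sparse CCA with an L1 penalty. It is not pcMTSCCA or hocMTSCCA: the multi-task structure, the group-level penalty, and the CEP-guided regularizers are all omitted. The function names (`soft_threshold`, `sparse_cca`), the parameters `lam_u` and `lam_v`, and the simplifying assumption that within-set covariances are treated as identity are illustrative choices, not taken from the paper.

```python
import numpy as np


def soft_threshold(v, lam):
    """Element-wise soft-thresholding, the proximal operator of the L1 penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def normalize(v):
    """Scale v to unit L2 norm; return it unchanged if it is all zeros."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_cca(X, Y, lam_u=0.2, lam_v=0.2, n_iter=100):
    """Generic L1-penalized CCA by alternating thresholded power iterations.

    Assumption (illustrative): within-set covariances are treated as identity,
    so only the cross-covariance matrix between X and Y drives the updates.
    X is (n, p), e.g. genetic markers; Y is (n, q), e.g. endophenotypes.
    Returns the sparse weight vectors u, v and the resulting CCC.
    """
    # Standardize every column to zero mean and unit variance.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]  # (p, q) cross-covariance

    # Initialize v with the leading right singular vector of C.
    _, _, vt = np.linalg.svd(C, full_matrices=False)
    v = vt[0]

    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u = normalize(soft_threshold(C @ v, lam_u))
        v = normalize(soft_threshold(C.T @ u, lam_v))

    # CCC: Pearson correlation between the two canonical variates.
    ccc = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, ccc
```

In this sketch, larger values of `lam_u` and `lam_v` drive more weights to exactly zero, which is how a sparse model of this kind selects a small subset of features on each side; the nonzero entries of `u` and `v` play the role of the selected loci and endophenotypic markers.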
Funding: This work was supported in part by the STI2030-Major Projects (Grant No. 2022ZD0213700), the National Natural Science Foundation of China (Grant Nos. 62136004, 61973255, and 61936007), the Natural Science Basic Research Program of Shaanxi (Grant No. 2020JM-142), and the Innovation Foundation for Doctor Dissertation at Northwestern Polytechnical University, China (Grant No. CX2023062).