Abstract: Association analysis provides an opportunity to find genetic variants underlying complex traits. A principal components regression (PCR)-based approach was shown to outperform some competing approaches. However, a limitation of this method is that the principal components (PCs) selected from single nucleotide polymorphisms (SNPs) may be unrelated to the phenotype. In this article, we investigate the theoretical properties of such a method in more detail. We first derive the exact power function of the test based on PCR, and hence clarify the relationship between the test power and the degrees of freedom (DF). Next, we extend the PCR test to a general weighted PCs test, which provides a unified framework for understanding the properties of some related statistics. We then compare the performance of these tests. We also introduce several data-driven adaptive alternatives to overcome difficulties in the PCR approach. Finally, we illustrate our results using simulations based on real genotype data. The simulation study shows the risk of using the unsupervised rule to determine the number of PCs, and demonstrates that there is no single uniformly powerful method for detecting genetic variants.
Abstract: Long-term genetic studies utilizing backcross and congenic strain analyses coupled with positional cloning strategies and functional studies identified Cdkn2a, Mtor, and Mndal as mouse plasmacytoma susceptibility/resistance genes. Tumor incidence data in congenic strains carrying the resistance alleles of Cdkn2a and Mtor led us to hypothesize that drug combinations affecting these pathways are likely to have an additive, if not synergistic, effect in inhibiting tumor cell growth. Traditional and novel systems-level genomic approaches were used to assess combination activity, disease specificity, and clinical potential of a drug combination involving rapamycin/everolimus, an Mtor inhibitor, with entinostat, a histone deacetylase inhibitor. The combination synergistically repressed oncogenic MYC and activated the Cdkn2a tumor suppressor. The identification of MYC as a primary upstream regulator led to the identification of small molecule binders of the G-quadruplex structure that forms in the NHEIII region of the MYC promoter. These studies highlight the importance of identifying drug combinations which simultaneously upregulate tumor suppressors and downregulate oncogenes.
Abstract: Objectives: To formulate an equation for fine mapping of disease loci under complex conditions and determine the marker-disease distance in a specific case using this equation. Methods: Lewontin’s linkage disequilibrium (LD) measure D’ was used to formulate an equation for mapping disease genes in the presence of phenocopies, locus heterogeneity, gene-gene and gene-environment interactions, incomplete penetrance, uncertain liability and threshold, incomplete initial LD, natural selection, recurrent mutation, high disease allele frequency and unknown mode of inheritance. This equation was then used to determine the distance between a marker (ε4 within the apolipoprotein E gene, APOE) and Alzheimer’s disease (AD) loci using published data. Results: An equation was formulated for mapping disease genes under the above conditions. If these conditions are present but ignored, then the recombination fraction θ between marker and disease loci will be either overestimated or estimated with little bias. Therefore, an upper limit of θ can be obtained. AD has been found to be associated with the marker allele ε4 in Africans, Asians, and Caucasians. This suggests that the AD-ε4 allelic LD predates the divergence of peoples occurring 100,000 years ago. With the age of AD-ε4 allelic LD so estimated, the maximal distance was calculated to be 23.2 kb (mean 5.8 kb). Conclusions: (1) A method is developed for LD mapping of susceptibility genes. (2) A mutation within the APOE gene itself, among others, is responsible for the susceptibility to AD, which is supported by recent evidence from studies using transgenic mice.
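For readers unfamiliar with the LD measure named in this abstract, the following is a minimal Python sketch of the standard textbook definition of Lewontin's D' for two biallelic loci. It is not the mapping equation developed in the paper, which the abstract does not state. The function name and argument names are hypothetical, and the signed form of D' is returned here (many authors report its absolute value).

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Signed Lewontin's D' for two biallelic loci (textbook definition).

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # Raw disequilibrium coefficient D.
    d = p_ab - p_a * p_b
    # D is normalised by its largest attainable magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return d / d_max
```

With both allele frequencies at 0.5, a haplotype frequency of 0.5 gives complete LD (D' = 1), while a haplotype frequency of 0.25 corresponds to linkage equilibrium (D' = 0).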
Funding: Supported by the National Basic Research Program (973) of China (No. 2004CB117306) and the Hi-Tech Research and Development Program (863) of China (No. 2006AA10A102).
Abstract: A method was proposed for the detection of outliers and influential observations in the framework of a mixed linear model, prior to quantitative trait locus (QTL) mapping analysis. We investigated the impact of outliers on QTL mapping for complex traits in a mouse BXD population, and observed that dropping outliers could provide evidence of additional QTL and epistatic loci affecting the 1stBrain-OB and the 2ndBrain-OB in a cross of this population. The results also revealed a remarkable increase in the estimated heritabilities of QTL in the absence of outliers. In addition, simulations were conducted to investigate the detection power and false discovery rates (FDRs) of QTLs in the presence and absence of outliers. The results suggested that the presence of a small proportion of outliers could increase the FDR and hence decrease the detection power of QTLs. In the presence of outliers, the estimated standard errors for the position, additive and additive × environment interaction effects of QTLs increased drastically.
Funding: The National Basic Research Program of China (2006CB 101700); the National Natural Science Foundation of China (30370758); the Program for New Century Excellent Talents in University, Ministry of Education of China (NCET-05-0502); and the Natural Science Foundation of Jiangsu Province of China to Xu Chenwu (BK2006066).
Abstract: Complex traits are features whose properties are determined by both genetic and environmental factors. Generally, complex traits include the classical quantitative traits with continuous distribution, the binary or categorical traits with discrete distribution controlled by polygenes, and other traits that cannot be measured exactly, such as behavior and psychology. Most human complex diseases and most economically important traits in plants and animals belong to this category. Understanding the molecular basis of complex traits plays a vital role in the genetic improvement of plants and animals through breeding. In this article, the concept and research background of complex traits are summarized, and the strategies, methods and progress made in dissecting the genetic basis of complex traits are reviewed. The challenges and possible developments in future research are also discussed.
Funding: Supported by the National Natural Science Foundation of China (32370723, 32000410).
Abstract: Despite considerable advances in extracting crucial insights from bio-omics data to unravel the intricate mechanisms underlying complex traits, the absence of a universal multi-modal computational tool with robust interpretability for accurate phenotype prediction and identification of trait-associated genes remains a challenge. This study introduces the dual-extraction modeling (DEM) approach, a multi-modal deep-learning architecture designed to extract representative features from heterogeneous omics datasets, enabling the prediction of complex trait phenotypes. Through comprehensive benchmarking experiments, we demonstrate the efficacy of DEM in classification and regression prediction of complex traits. DEM consistently exhibits superior accuracy, robustness, generalizability, and flexibility. Notably, we establish its effectiveness in predicting pleiotropic genes that influence both flowering time and rosette leaf number, underscoring its commendable interpretability. In addition, we have developed user-friendly software to facilitate seamless utilization of DEM’s functions. In summary, this study presents a state-of-the-art approach with the ability to effectively predict qualitative and quantitative traits and identify functional genes, confirming its potential as a valuable tool for exploring the genetic basis of complex traits.
Funding: XS was in receipt of a National Natural Science Foundation of China (NSFC) grant (No. 12171495), a Natural Science Foundation of Guangdong Province grant (No. 2114050001435), a National Key Research and Development Program grant (No. 2022YFF1202105), and Swedish Research Council (Vetenskapsrådet) grants (No. 2017-02543 and No. 2022-01309). Further support was provided by the Swedish Research Council grant (No. 2017-02543, XS). The Swedish National Infrastructure for Computing (SNIC) utilized was partially funded by the Swedish Research Council through grant agreement No. 2018-05973.
Abstract: Alternative splicing exists in most multi-exonic genes, and exploring these complex alternative splicing events and their resultant isoform expressions is essential. However, it has become conventional that RNA sequencing results have often been summarized into gene-level expression counts mainly due to the multiple ambiguous mapping of reads at highly similar regions. Transcript-level quantification and interpretation are often overlooked, and biological interpretations are often deduced based on combined transcript information at the gene level. Here, for the most variable tissue of alternative splicing, the brain, we estimate isoform expressions in 1,191 samples collected by the Genotype-Tissue Expression (GTEx) Consortium using a powerful method that we previously developed. We perform genome-wide association scans on the isoform ratios per gene and identify isoform-ratio quantitative trait loci (irQTL), which could not be detected by studying gene-level expressions alone. By analyzing the genetic architecture of the irQTL, we show that isoform ratios regulate educational attainment via multiple tissues including the frontal cortex (BA9), cortex, cervical spinal cord, and hippocampus. These tissues are also associated with different neuro-related traits, including Alzheimer’s or dementia, mood swings, sleep duration, alcohol intake, intelligence, anxiety or depression, etc. Mendelian randomization (MR) analysis revealed 1,139 pairs of isoforms and neuro-related traits with plausible causal relationships, showing much stronger causal effects than on general diseases measured in the UK Biobank (UKB). Our results highlight essential transcript-level biomarkers in the human brain for neuro-related complex traits and diseases, which could be missed by merely investigating overall gene expressions.
Funding: Supported by the National Natural Science Foundation of China (81171880); the National Basic Research Program of China (2011CB51001 to S. Huang); the GeNeSys Consortium (to O. Goldmann and E. Medina
Abstract: It has long been assumed that most parts of a genome and most genetic variations or SNPs are non-functional with regard to reproductive fitness. However, the collective effects of SNPs have yet to be examined by experimental science. We here developed a novel approach to examine the relationship between traits and the total amount of SNPs in panels of genetic reference populations. We identified the minor alleles (MAs) in each panel and the MA content (MAC) that each inbred strain carried for a set of SNPs with genotypes determined in these panels. MAC was nearly linearly linked to quantitative variations in numerous traits in model organisms, including life span, tumor susceptibility, learning and memory, sensitivity to alcohol and anti-psychotic drugs, and two correlated traits, poor reproductive fitness and strong immunity. These results suggest that the collective effects of SNPs are functional and do affect reproductive fitness.
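The two steps described in this abstract (identify the minor allele at each SNP in a panel, then score each inbred strain by the minor alleles it carries) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the input layout (one allele per SNP per strain, assuming homozygous inbred strains and biallelic SNPs), the choice to express MAC as a fraction of informative SNPs, and the skipping of monomorphic or tied SNPs are all assumptions.

```python
from collections import Counter


def minor_alleles(genotypes):
    """Return the minor allele at each SNP, or None if there is none.

    genotypes: dict mapping strain name -> sequence of alleles, one per SNP.
    Assumes biallelic SNPs and homozygous (inbred) strains.
    """
    n_snps = len(next(iter(genotypes.values())))
    minors = []
    for i in range(n_snps):
        counts = Counter(alleles[i] for alleles in genotypes.values())
        if len(counts) < 2:
            minors.append(None)  # monomorphic in this panel
            continue
        (_, major_count), (minor, minor_count) = counts.most_common(2)
        # A tie means neither allele is "minor"; skip such SNPs.
        minors.append(None if major_count == minor_count else minor)
    return minors


def minor_allele_content(genotypes):
    """Fraction of informative SNPs at which each strain carries the minor allele."""
    minors = minor_alleles(genotypes)
    informative = [i for i, m in enumerate(minors) if m is not None]
    if not informative:
        raise ValueError("no SNP with a well-defined minor allele")
    return {
        strain: sum(alleles[i] == minors[i] for i in informative) / len(informative)
        for strain, alleles in genotypes.items()
    }
```

Given a panel such as `{"S1": ["A", "A", "G"], "S2": ["A", "T", "G"], ...}`, `minor_allele_content` returns one score per strain, which could then be compared with trait values across strains.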
Funding: Funded by the Natural Science Foundation of China (31961133002, 31901553, and 31771879); the National Key Research and Development Program of China (2020YFE0202300); the Science and Technology Major Program of Hubei Province (2021ABA011); the Swedish Research Council for Environment, Agricultural Sciences, and Spatial Planning (2019-01600); the Key Science and Technology Project of the China National Tobacco Corporation (110202101040 JY-17); and the Jilin Scientific and Technological Development Program (20190201290JC).
Abstract: Phenotypic plasticity is the ability of a given genotype to produce multiple phenotypes in response to changing environmental conditions. Understanding the genetic basis of phenotypic plasticity and establishing a predictive model is highly relevant to future agriculture under a changing climate. Here we report findings on the genetic basis of phenotypic plasticity for 23 complex traits using a diverse maize population planted at five sites with distinct environmental conditions. We found that latitude-related environmental factors were the main drivers of across-site variation in flowering time traits but not in plant architecture or yield traits. For the 23 traits, we detected 109 quantitative trait loci (QTLs), 29 for mean values, 66 for plasticity, and 14 for both parameters, and 80% of the QTLs interacted with latitude. The effects of several QTLs changed in magnitude or sign, driving variation in phenotypic plasticity. We experimentally validated one plastic gene, ZmTPS14.1, whose effect was likely mediated by the compensation effect of ZmSPL6 from a downstream pathway. By integrating genetic diversity, environmental variation, and their interaction into a joint model, we could provide site-specific predictions with increased accuracy by as much as 9.9%, 2.2%, and 2.6% for days to tassel, plant height, and ear weight, respectively. This study revealed a complex genetic architecture involving multiple alleles, pleiotropy, and genotype-by-environment interaction that underlies variation in the mean and plasticity of maize complex traits. It provides novel insights into the dynamic genetic architecture of agronomic traits in response to changing environments, paving a practical way toward precision agriculture.
Abstract: Many rice-growing areas are affected by high concentrations of arsenic (As). Rice varieties that prevent As uptake and/or accumulation can mitigate As threats to human health. Genomic selection is known to facilitate rapid selection of superior genotypes for complex traits. We explored the predictive ability (PA) of genomic prediction with single-environment models, accounting or not for trait-specific markers, multi-environment models, and multi-trait and multi-environment models, using the genotypic (1600K SNPs) and phenotypic (grain As content, grain yield and days to flowering) data of the Bengal and Assam Aus Panel. Under the baseline single-environment model, PA of up to 0.707 and 0.654 was obtained for grain yield and grain As content, respectively; the three prediction methods (Bayesian Lasso, genomic best linear unbiased prediction and reproducing kernel Hilbert spaces) were considered to perform similarly, and marker selection based on linkage disequilibrium made it possible to reduce the number of SNPs to 17K without a negative effect on the PA of genomic predictions. Single-environment models giving distinct weight to trait-specific markers in the genomic relationship matrix outperformed the baseline models by up to 32%. Multi-environment models, accounting for genotype × environment interactions, and multi-trait and multi-environment models outperformed the baseline models by up to 47% and 61%, respectively. Among the multi-trait and multi-environment models, the Bayesian multi-output regressor stacking function obtained the highest predictive ability (0.831 for grain As) with much higher efficiency in computing time. These findings pave the way for breeding for As tolerance in the progenies of biparental crosses involving members of the Bengal and Assam Aus Panel. Genomic prediction can also be applied to breeding for other complex traits under multiple environments.
Funding: Supported by the National Natural Science Foundation of China (31790413, 31760657).
Abstract: The limited knowledge of genomic noncoding and regulatory regions has restricted our ability to decipher the genetic mechanisms underlying complex traits in pigs. In this study, we characterized the spatiotemporal landscape of putative enhancers and promoters and their target genes by combining H3K27ac-targeted ChIP-Seq and RNA-Seq in fetal (prenatal days 74–75) and adult (postnatal days 132–150) tissues (brain, liver, heart, muscle and small intestine) sampled from Asian aboriginal Bama Xiang and European highly selected Large White pigs of both sexes. We identified 101,290 H3K27ac peaks, marking 18,521 promoters and 82,769 enhancers, including peaks that were active across all tissues and developmental stages (which could indicate safe harbor loci for exogenous gene insertion) and tissue- and developmental stage-specific peaks (which regulate gene pathways matching tissue- and developmental stage-specific physiological functions). We found that H3K27ac and DNA methylation in the promoter region of the XIST gene may be involved in X chromosome inactivation and demonstrated the utility of the present resource for revealing the regulatory patterns of known causal genes and prioritizing candidate causal variants for complex traits in pigs. In addition, we identified an average of 1,124 super-enhancers per sample and found that they were more likely to show tissue-specific activity than ordinary peaks. We have developed a web browser to improve the accessibility of the results (http://segtp.jxau.edu.cn/pencode/?genome=susScr11).
Funding: Supported by the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01); the National Science Foundation of China (31330038); the CAMS Innovation Fund for Medical Sciences (2019-I2M-5-066); the Science and Technology Committee of Shanghai Municipality (16JC1400500); the Ministry of Science and Technology (2015FY1117000); the 111 Project (B13016); the Major Project of Special Development Funds of Zhangjiang National Independent Innovation Demonstration Zone (ZJ2019-ZD-004); and the Postdoctoral Science Foundation of China (2018M640333).
Abstract: Altitude acclimatization is the human physiological process of adjusting to decreased oxygen availability. Because several physiological processes are involved and their correlations are complicated, analyses of single traits are insufficient to reveal the complex mechanism of high-altitude acclimatization. In this study, we examined these physiological responses as composite phenotypes, each represented by a linear combination of physiological traits. We developed a strategy that combines spectral clustering and partial least squares path modeling (PLSPM) to define composite phenotypes based on a cohort study of 883 Chinese Han males. In addition, we captured 14 composite phenotypes from 28 physiological traits of high-altitude acclimatization. Using these composite phenotypes, we applied k-means clustering to reveal hidden physiological heterogeneity in the population during high-altitude acclimatization. Furthermore, we employed multivariate linear regression to systematically model (Models 1 and 2) oxygen saturation (SpO2) changes in high-altitude acclimatization and evaluated model fit. Model 2, based on composite phenotypes, fit better than the single trait-based Model 1 on all measurement indices. This new strategy of using composite phenotypes may be employed as a general strategy for complex traits research, such as genetic loci discovery and phenomics analyses.
Funding: Supported by the National Basic Research Program (973) of China (No. 2004CB117306) and the Hi-Tech Research and Development Program (863) of China (No. 2006AA10A102).
Abstract: Association analysis provides an opportunity to find genetic variants underlying complex traits. A principal components regression (PCR)-based approach was shown to outperform some competing approaches. However, a limitation of this method is that the principal components (PCs) selected from single nucleotide polymorphisms (SNPs) may be unrelated to the phenotype. In this article, we investigate the theoretical properties of such a method in more detail. We first derive the exact power function of the test based on PCR, and hence clarify the relationship between the test power and the degrees of freedom (DF). Next, we extend the PCR test to a general weighted PCs test, which provides a unified framework for understanding the properties of some related statistics. We then compare the performance of these tests. We also introduce several data-driven adaptive alternatives to overcome difficulties in the PCR approach. Finally, we illustrate our results using simulations based on real genotype data. The simulation study shows the risk of using an unsupervised rule to determine the number of PCs, and demonstrates that no single method is uniformly the most powerful for detecting genetic variants.
Funding: Supported by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research, and the MMRF (Multiple Myeloma Research Foundation).
Abstract: Long-term genetic studies utilizing backcross and congenic strain analyses, coupled with positional cloning strategies and functional studies, identified Cdkn2a, Mtor, and Mndal as mouse plasmacytoma susceptibility/resistance genes. Tumor incidence data in congenic strains carrying the resistance alleles of Cdkn2a and Mtor led us to hypothesize that drug combinations affecting these pathways are likely to have an additive, if not synergistic, effect in inhibiting tumor cell growth. Traditional and novel systems-level genomic approaches were used to assess the combination activity, disease specificity, and clinical potential of a drug combination involving rapamycin/everolimus, an Mtor inhibitor, with entinostat, a histone deacetylase inhibitor. The combination synergistically repressed oncogenic MYC and activated the Cdkn2a tumor suppressor. The identification of MYC as a primary upstream regulator led to the identification of small molecule binders of the G-quadruplex structure that forms in the NHEIII region of the MYC promoter. These studies highlight the importance of identifying drug combinations that simultaneously upregulate tumor suppressors and downregulate oncogenes.
Abstract: Objectives: To formulate an equation for fine mapping of disease loci under complex conditions and to determine the marker-disease distance in a specific case using this equation. Methods: Lewontin's linkage disequilibrium (LD) measure D' was used to formulate an equation for mapping disease genes in the presence of phenocopies, locus heterogeneity, gene-gene and gene-environment interactions, incomplete penetrance, uncertain liability and threshold, incomplete initial LD, natural selection, recurrent mutation, high disease allele frequency and unknown mode of inheritance. This equation was then used to determine the distance between a marker (ε4 within the apolipoprotein E gene, APOE) and Alzheimer's disease (AD) loci using published data. Results: An equation was formulated for mapping disease genes under the above conditions. If these conditions are present but ignored, the recombination fraction θ between the marker and disease loci will be either overestimated or estimated with little bias. Therefore, an upper limit of θ can be obtained. AD has been found to be associated with the marker allele ε4 in Africans, Asians, and Caucasians. This suggests that the AD-ε4 allelic LD predates the divergence of these peoples, which occurred 100,000 years ago. With the age of the AD-ε4 allelic LD so estimated, the maximal distance was calculated to be 23.2 kb (mean 5.8 kb). Conclusions: (1) A method is developed for LD mapping of susceptibility genes. (2) A mutation within the APOE gene itself, among others, is responsible for the susceptibility to AD, which is supported by recent evidence from studies using transgenic mice.
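The reasoning behind an upper limit on θ can be illustrated with a minimal sketch. It assumes the textbook LD decay relation D'_t = D'_0 (1 − θ)^t with complete initial LD (D'_0 = 1), which is not necessarily the equation derived in the article; under that assumption, incomplete initial LD would make the true θ smaller than the value computed here. The function names, the D' value, the generation time and the 1 cM per Mb conversion are all hypothetical placeholders, not figures from the published data.

```python
import math


def theta_upper_bound(d_prime: float, generations: float) -> float:
    """Upper bound on the recombination fraction, assuming D'_t = (1 - theta)^t.

    Solving for theta gives theta = 1 - D'_t ** (1 / t). Assuming complete
    initial LD (D'_0 = 1) makes this the largest theta compatible with D'_t.
    """
    if not 0.0 < d_prime <= 1.0:
        raise ValueError("d_prime must lie in (0, 1]")
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 1.0 - d_prime ** (1.0 / generations)


def theta_to_kb(theta: float, cm_per_mb: float = 1.0) -> float:
    """Convert a small recombination fraction to kilobases.

    Uses the small-theta approximation (map distance in cM ~ 100 * theta)
    and an assumed genome-average recombination rate in cM per Mb.
    """
    centimorgans = 100.0 * theta
    return centimorgans / cm_per_mb * 1000.0


# Hypothetical inputs: LD age of 100,000 years, 25 years per generation,
# and an observed D' of 0.5 between the marker allele and the disease allele.
generations = 100_000 / 25
theta_max = theta_upper_bound(0.5, generations)
distance_kb = theta_to_kb(theta_max)
print(f"theta <= {theta_max:.6f}, distance <= {distance_kb:.1f} kb")
```

The bound tightens as the assumed age of the LD grows, which is why dating the AD-ε4 association to before the divergence of populations yields a short maximal distance.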