A central goal of genetics is to understand the links between genetic variation and disease. Intuitively, one might expect disease-causing variants to cluster into key pathways that drive disease etiology. But for complex traits, association signals tend to be spread across most of the genome, including near many genes without an obvious connection to disease.
Using newly developed methods and software, association mapping was conducted for chromium content and total sugar in tobacco leaf, based on four-omics datasets. Data on genotype and phenotype were collected for 60 leaf samples at four developmental stages, from three plant architectural positions, and for three cultivars grown in two locations. Association mapping was conducted to detect genetic variants at quantitative trait SNP (QTS) loci, quantitative trait transcript (QTT) differences, quantitative trait protein (QTP) variability, and quantitative trait metabolite (QTM) changes, which can be summarized as QTX locus variation. The total heritabilities of the four-omics loci for both traits tested were 23.60% for epistasis and 15.26% for treatment interaction. Epistasis and environment × treatment interaction had important impacts on complex traits at all omics levels. For decreasing chromium content and increasing total sugar in tobacco leaf, six methylated loci can be used directly for marker-assisted selection, and the expression of ten QTTs, seven QTPs, and six QTMs can be modified by selection or cultivation.
Genome-wide association study (GWAS) has been a standard approach for discovering the genetic determinants underlying complex traits. A major challenge in GWAS is how to improve analysis power, uncover complex genetic correlations, and reveal gene-gene and gene-environment interactions through integrated analysis of multiple genetically related traits. To address these challenges, we proposed a mixed linear model-based joint association analysis method for multiple traits, which includes epistasis and gene-environment interaction in the mapping model and uses within-trait variance and between-trait covariance simultaneously. An F-statistic based on Wilks' statistic is used to test the significance of each SNP and each pair of interacting SNPs, and each genetic effect of a QTS is estimated and tested by the MCMC method based on a QTS full model. Simulations showed that the multi-trait GWAS method could provide increased power in detecting pleiotropic loci affecting more than one trait and could estimate QTS effects without bias. To demonstrate the performance of the proposed method, we analyzed four blood lipid traits in the Multi-Ethnic Study of Atherosclerosis (MESA) cohort and two yield-related traits in a rice immortalized F2 dataset. A software package was developed for the proposed method.
Biodiversity declines have motivated many studies on the relationship between species diversity and ecosystem functioning. In this study, we described the spatial-temporal characteristics of demersal fish communities along a coastal habitat in Rongcheng Bay, Shandong Peninsula, China, with both species-based and biological trait-based approaches. The field survey was carried out monthly using traps from April to October of 2018 and divided into three seasons (spring: April and May; summer: June, July and August; autumn: September, October and November). The study area included five distinct habitats: seagrass bed, natural rocky reef, bare sand, artificial reef together with natural rocky reef, and artificial reef together with bare sand. We analyzed the fish communities with three taxonomic diversity indices (Shannon-Wiener, Simpson, and Pielou evenness) and four functional diversity indices (FRic, FEve, FDiv, and FDis), based on 7 functional groups which are categorized into 27 traits. The results showed no significant differences in taxonomic diversity indices among habitats in the three seasons. However, significant differences were found in the functional richness of fish communities among habitats in the three seasons. The seagrass bed had the highest functional richness in spring and autumn. This study demonstrates that seagrass beds are very important in enhancing the functional diversity of fish communities in a complex habitat. It also indicates that combining taxonomic diversity and functional diversity provides a more detailed description of the characteristics of fish communities.
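The three taxonomic indices named in the abstract above can be illustrated with a minimal Python sketch. It assumes the common textbook definitions (Shannon-Wiener H' with natural logarithms, Simpson in its 1 - Σp² form, and Pielou evenness J = H' / ln S); the abstract does not state which variants the authors used, and the function name and the catch counts are hypothetical.

```python
import math

def taxonomic_diversity(counts):
    """Return Shannon-Wiener, Simpson (1 - sum p^2) and Pielou evenness.

    counts: individuals caught per species (hypothetical input format).
    These are textbook definitions, assumed here for illustration only.
    """
    counts = [c for c in counts if c > 0]       # ignore absent species
    total = sum(counts)
    props = [c / total for c in counts]         # relative abundances p_i
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)
    richness = len(props)                       # number of species S
    # Evenness is undefined for a single species; report 0.0 in that case.
    pielou = shannon / math.log(richness) if richness > 1 else 0.0
    return {"shannon": shannon, "simpson": simpson, "pielou": pielou}

# Hypothetical trap catch for one habitat in one season.
print(taxonomic_diversity([12, 7, 3, 1]))
```

A perfectly even community (equal counts for every species) gives a Pielou value of 1, which is a quick sanity check on any implementation.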
Complex traits are features whose properties are determined by multiple factors, which can be genetic or environmental. Most of the economically important characteristics of plants and animals belong to this special catego-
A method was proposed for the detection of outliers and influential observations in the framework of a mixed linear model, prior to quantitative trait locus (QTL) mapping analysis. We investigated the impact of outliers on QTL mapping for complex traits in a mouse BXD population and observed that dropping outliers could provide evidence of additional QTL and epistatic loci affecting the 1stBrain-OB and the 2ndBrain-OB in a cross of this population. The results also revealed a remarkable increase in the estimated heritabilities of QTL in the absence of outliers. In addition, simulations were conducted to investigate the detection power and false discovery rates (FDRs) of QTLs in the presence and absence of outliers. The results suggested that the presence of a small proportion of outliers could increase the FDR and hence decrease the detection power of QTLs. In the presence of outliers, the estimated standard errors for the position, additive, and additive × environment interaction effects of QTLs could increase drastically.
Complex traits are features whose properties are determined by both genetic and environmental factors. Generally, complex traits include the classical quantitative traits with continuous distribution, the binary or categorical traits with discrete distribution controlled by polygenes, and other traits that cannot be measured exactly, such as behavior and psychology. Most human complex diseases and most economically important traits in plants and animals belong to this category. Understanding the molecular basis of complex traits plays a vital role in the genetic improvement of plants and animals. In this article, the concept and research background of complex traits are summarized, and the strategies, methods, and major progress made in dissecting the genetic basis of complex traits are reviewed. The challenges and possible developments in future research are also discussed.
The Collaborative Cross (CC) mouse model is a next-generation mouse genetic reference population (GRP) designed for high-resolution quantitative trait loci (QTL) mapping of complex traits during health and disease. The CC lines were generated from reciprocal crosses of eight divergent mouse founder strains, composed of five classical and three wild-derived strains. Complex traits are defined as being controlled by variations within multiple genes and by gene/environment interactions. In this article, we introduce and present a variety of protocols and results from studies of the host response to infectious and chronic diseases, including type 2 diabetes and metabolic diseases, body composition, immune response, colorectal cancer, susceptibility to Aspergillus fumigatus, Klebsiella pneumoniae, Pseudomonas aeruginosa, sepsis, and mixed infections of Porphyromonas gingivalis and Fusobacterium nucleatum, which were conducted at our laboratory using the CC mouse population. These traits are observed at multiple levels of the body systems, including metabolism, body weight, immune profile, and susceptibility or resistance to the development and progress of infectious or chronic diseases. Herein, we present full protocols and step-by-step methods, implemented in our laboratory, for the phenotypic and genotypic characterization of the different CC lines and for mapping the genes underlying the host response to these infections and chronic diseases. The CC mouse model is a unique and powerful GRP for dissecting the host genetic architectures underlying complex traits, including chronic and infectious diseases.
Phenotypic plasticity is the ability of a given genotype to produce multiple phenotypes in response to changing environmental conditions. Understanding the genetic basis of phenotypic plasticity and establishing a predictive model is highly relevant to future agriculture under a changing climate. Here we report findings on the genetic basis of phenotypic plasticity for 23 complex traits using a diverse maize population planted at five sites with distinct environmental conditions. We found that latitude-related environmental factors were the main drivers of across-site variation in flowering time traits but not in plant architecture or yield traits. For the 23 traits, we detected 109 quantitative trait loci (QTLs): 29 for mean values, 66 for plasticity, and 14 for both parameters; 80% of the QTLs interacted with latitude. The effects of several QTLs changed in magnitude or sign, driving variation in phenotypic plasticity. We experimentally validated one plastic gene, ZmTPS14.1, whose effect was likely mediated by the compensation effect of ZmSPL6 from a downstream pathway. By integrating genetic diversity, environmental variation, and their interaction into a joint model, we could provide site-specific predictions with accuracy increased by as much as 9.9%, 2.2%, and 2.6% for days to tassel, plant height, and ear weight, respectively. This study revealed a complex genetic architecture involving multiple alleles, pleiotropy, and genotype-by-environment interaction that underlies variation in the mean and plasticity of maize complex traits. It provides novel insights into the dynamic genetic architecture of agronomic traits in response to changing environments, paving a practical way toward precision agriculture.
Alternative splicing exists in most multi-exonic genes, and exploring these complex alternative splicing events and their resultant isoform expressions is essential. However, RNA sequencing results are conventionally summarized into gene-level expression counts, mainly because of ambiguous multiple mapping of reads in highly similar regions. Transcript-level quantification and interpretation are often overlooked, and biological interpretations are often deduced from combined transcript information at the gene level. Here, for the brain, the tissue with the most variable alternative splicing, we estimate isoform expressions in 1,191 samples collected by the Genotype-Tissue Expression (GTEx) Consortium using a powerful method that we previously developed. We perform genome-wide association scans on the isoform ratios per gene and identify isoform-ratio quantitative trait loci (irQTL), which could not be detected by studying gene-level expressions alone. By analyzing the genetic architecture of the irQTL, we show that isoform ratios regulate educational attainment via multiple tissues, including the frontal cortex (BA9), cortex, cervical spinal cord, and hippocampus. These tissues are also associated with different neuro-related traits, including Alzheimer's or dementia, mood swings, sleep duration, alcohol intake, intelligence, and anxiety or depression. Mendelian randomization (MR) analysis revealed 1,139 pairs of isoforms and neuro-related traits with plausible causal relationships, showing much stronger causal effects than on general diseases measured in the UK Biobank (UKB). Our results highlight essential transcript-level biomarkers in the human brain for neuro-related complex traits and diseases, which could be missed by merely investigating overall gene expressions.
Many rice-growing areas are affected by high concentrations of arsenic (As). Rice varieties that prevent As uptake and/or accumulation can mitigate As threats to human health. Genomic selection is known to facilitate rapid selection of superior genotypes for complex traits. We explored the predictive ability (PA) of genomic prediction with single-environment models (accounting or not for trait-specific markers), multi-environment models, and multi-trait and multi-environment models, using the genotypic (1600K SNPs) and phenotypic (grain As content, grain yield, and days to flowering) data of the Bengal and Assam Aus Panel. Under the base-line single-environment model, PA of up to 0.707 and 0.654 was obtained for grain yield and grain As content, respectively; the three prediction methods (Bayesian Lasso, genomic best linear unbiased prediction, and reproducing kernel Hilbert spaces) performed similarly, and marker selection based on linkage disequilibrium allowed the number of SNPs to be reduced to 17K without a negative effect on the PA of genomic predictions. Single-environment models giving distinct weight to trait-specific markers in the genomic relationship matrix outperformed the base-line models by up to 32%. Multi-environment models, accounting for genotype × environment interactions, and multi-trait and multi-environment models outperformed the base-line models by up to 47% and 61%, respectively. Among the multi-trait and multi-environment models, the Bayesian multi-output regressor stacking function obtained the highest predictive ability (0.831 for grain As) with much higher computing-time efficiency. These findings pave the way for breeding for As tolerance in the progenies of biparental crosses involving members of the Bengal and Assam Aus Panel. Genomic prediction can also be applied to breeding for other complex traits under multiple environments.
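Genome-wide association study (GWAS) is an effective method for locating variants in the genome that are significantly associated with traits. With more complete phenotype records, the development of high-throughput genotyping technology, and improvements in statistical methods, GWAS has been widely applied in human disease research and in animal and plant genetics. False positives are one of the important factors affecting the reliability of GWAS results. To control false positives, in addition to correcting P values, GWAS models have been continually improved, from the simplest analysis of variance (or the chi-square test for qualitative traits), to the general linear model (GLM) with fixed-effect covariates, and then to the mixed linear model (MLM) with random effects, thereby controlling false positives caused by a variety of confounding factors. Fitting the genetic effects of individuals as random effects defined by the genomic relationship matrix (GRM) is currently the common approach. Because parameter estimation for the MLM consumes substantial computing resources, researchers have continually tried to optimize both model solving and GRM construction (optimizing GRM construction also improves computational efficiency), eventually reducing the time complexity of MLM-based computation step by step from O(MN^3) to O(MN) and achieving a leap in computing speed and statistical power. To address false positives caused by unbalanced case-control ratios for qualitative traits, researchers further corrected the generalized linear mixed model (GLMM). This article gives a fairly comprehensive introduction to the basic principles and development of GWAS, with emphasis on the details of the improvement and optimization of MLM models in GWAS. It also lists applications of GWAS in agriculture, including research results in plants, animals, and microorganisms, as well as haplotype-based GWAS applications. Finally, future developments of GWAS are discussed from two perspectives: further improving the statistical power of GWAS, and GWAS experimental design.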
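The abstract above mentions correcting P values as one way to control false positives, without naming a specific correction. As a hedged illustration only, the sketch below applies the Bonferroni correction, one widely used choice and an assumption here rather than the review's stated method; the function name and the P values are hypothetical.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Return the Bonferroni threshold and indices of significant markers.

    Bonferroni is assumed for illustration; the source abstract does not
    say which P-value correction it has in mind.
    """
    n_tests = len(p_values)
    threshold = alpha / n_tests          # per-marker significance level
    hits = [i for i, p in enumerate(p_values) if p < threshold]
    return threshold, hits

# Hypothetical P values for five markers.
threshold, hits = bonferroni_hits([1e-9, 0.04, 0.003, 2e-6, 0.5])
print(threshold, hits)
```

With M markers the per-marker threshold shrinks as alpha / M, which is why model-based control of confounding (GLM, MLM) is needed alongside P-value correction rather than instead of it.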
The Quantitative Genetic Analysis Station (QGAStation) is a software package developed to perform statistical analysis for complex traits. It consists of five domains for handling data from diallel crosses, regional trials, core germplasm collections, QTL mapping, and microarray experiments. The first domain contains genetic models for diallel cross analysis, in which genetic variance components and genetic-by-environment interactions can be estimated and genetic effects can be predicted. The second domain evaluates the performance of varieties in regional trials by implementing a general statistical method that outperforms ANOVA in handling the unbalanced data that arise frequently in trials across multiple locations and over a number of years. The third domain, using predicted genotypic values as a proxy, constructs core germplasm collections covering sufficient genetic diversity with lower redundancy. The fourth domain manages genotypic and phenotypic data for QTL mapping. Linkage maps can be constructed and genetic distances can be estimated; the statistical methods implemented apply to both chiasmatic and achiasmatic organisms. Another part of this domain can filter systematic noise in phenotypic data. The fifth domain focuses on the cDNA expression data generated by microarray experiments. A two-step strategy has been implemented to detect differentially expressed genes and to estimate their effects. Except in the fourth domain, the major statistical methods used are mixed linear model approaches implemented in the C language. Computational efficiency is further boosted on computers equipped with graphics processing units (GPUs). A user-friendly graphical interface is provided for Microsoft Windows and Apple Mac operating systems. QGAStation is available at http://ibi.zju.edu.cn/software/qga/.
A promising way to uncover the genetic architectures underlying complex traits may lie in the ability to recognize the genetic variants and expression transcripts that are responsible for the traits' inheritance. However, statistical methods capable of investigating the association between the inheritance of a quantitative trait and expression transcripts are still limited. In this study, we describe a two-step approach that we developed to evaluate the contribution of expression transcripts to the inheritance of a complex trait. First, a mixed linear model approach was applied to detect significant trait-associated differentially expressed transcripts. Then, conditional analysis was used to predict the contribution of the differentially expressed genes to a target trait. Diallel cross data from cotton were used to test the application of the approach. We propose that the detected differentially expressed transcripts with a strong impact on the target trait could be used as intermediates for screening lines to improve traits in plant and animal breeding programs. The approach can benefit the discovery of the genetic mechanisms underlying complex traits.
Most of the important agronomic traits in crops, such as yield and quality, are complex traits affected by multiple genes with gene × gene interaction as well as gene × environment interaction. Understanding the genetic architecture of complex traits is a long-term task for quantitative geneticists and plant breeders who wish to design efficient breeding programs. Conventionally, the genetic properties of traits can be revealed by partitioning the total variation into components caused by specific genetic effects. With recent advances in molecular genotyping and high-throughput technology, unraveling the genetic architecture of complex traits by analyzing quantitative trait loci (QTL) has become possible. The improvement of complex traits has also been achieved by pyramiding individual QTL. In this review, we describe some statistical methods for QTL mapping that can be used to analyze QTL × QTL interaction and QTL × environment interaction, and discuss their applications in crop breeding for complex traits.
It has long been assumed that most parts of a genome and most genetic variations or SNPs are non-functional with regard to reproductive fitness. However, the collective effects of SNPs have yet to be examined by experimental science. Here we developed a novel approach to examine the relationship between traits and the total amount of SNPs in panels of genetic reference populations. We identified the minor alleles (MAs) in each panel and the MA content (MAC) that each inbred strain carried for a set of SNPs with genotypes determined in these panels. MAC was nearly linearly linked to quantitative variations in numerous traits in model organisms, including life span, tumor susceptibility, learning and memory, sensitivity to alcohol and anti-psychotic drugs, and two correlated traits, poor reproductive fitness and strong immunity. These results suggest that the collective effects of SNPs are functional and do affect reproductive fitness.
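To make the idea of minor allele content concrete, here is a minimal Python sketch. It assumes inbred (homozygous) strains with biallelic SNPs coded 0/1, defines the minor allele at each SNP as the less frequent allele within the panel, and reports MAC as the fraction of informative SNPs at which a strain carries the minor allele. The abstract does not give the authors' exact computation, so the coding scheme, tie handling, function name, and toy panel are all assumptions.

```python
def minor_allele_content(genotypes):
    """Compute MAC per strain for a panel of inbred strains.

    genotypes: list of strains, each a list of allele codes (0 or 1),
    one per SNP (hypothetical encoding for homozygous inbred lines).
    """
    n_strains = len(genotypes)
    n_snps = len(genotypes[0])
    minor = []
    for j in range(n_snps):
        ones = sum(row[j] for row in genotypes)
        if 2 * ones < n_strains:
            minor.append(1)        # allele 1 is the rarer allele
        elif 2 * ones > n_strains:
            minor.append(0)        # allele 0 is the rarer allele
        else:
            minor.append(None)     # exact tie: no minor allele, skip SNP
    informative = [j for j in range(n_snps) if minor[j] is not None]
    mac = []
    for row in genotypes:
        carried = sum(1 for j in informative if row[j] == minor[j])
        mac.append(carried / len(informative) if informative else 0.0)
    return mac

# Toy panel: five strains (rows) by four SNPs (columns).
panel = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
print(minor_allele_content(panel))  # → [0.0, 0.25, 0.5, 0.25, 0.0]
```

The per-strain MAC values could then be compared with a trait value per strain, for example by a simple correlation, which is the kind of relationship the abstract describes.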
Chromosome segment substitution lines have been created in several experimental models, including many plant and animal species, and are useful tools for the genetic analysis and mapping of complex traits. The traditional t-test is usually applied to identify a quantitative trait locus (QTL) contained within a chromosome segment and to estimate the QTL's effect. However, current methods cannot uncover the entire genetic structure of complex traits. For example, they cannot distinguish between main effects and epistatic effects. In this paper, a linear epistatic model was constructed to dissect complex traits. First, all the long substituted segments were divided into overlapping small bins, and each small bin was considered a unique independent variable. The genetic model for complex traits was then constructed. When all possible main effects and epistatic effects are considered, the dimension of the linear model can become extremely high. Therefore, variable selection via stepwise regression (Bin-REG) was proposed for the epistatic QTL analysis in the present study. Furthermore, we tested the feasibility of using the LASSO (least absolute shrinkage and selection operator) algorithm to estimate epistatic effects, examined the fully Bayesian SSVS (stochastic search variable selection) approach, tested the empirical Bayes (E-BAYES) method, and evaluated the penalized likelihood (PENAL) method for mapping epistatic QTLs. Simulation studies suggested that all of the above methods, excluding the LASSO and PENAL approaches, performed satisfactorily. The Bin-REG method appears to outperform all other methods in terms of estimating positions and effects.
Funding (tobacco leaf four-omics association mapping study): Supported by the National Basic Research Program of China (2011CB109306 and 2009CB118404), the Program of Introducing Talents of Discipline to Universities of China ("111" Project, B06014), and Research Programs (CNTC-D2011100, CNTC-[2012]146, NY-[2011]3047, QKHRZ [2013] 02).
Funding (multi-trait GWAS study): The study was supported by the National Key Research and Development Program of China (2016YFC1303300), the National Natural Science Foundation of China (31671570, 31871707), the National Science Foundation (DMS2002865), and the 111 Project (BP2018021). The funders had no role in study design and data analysis. The authors thank the investigators, the staff, and the participants of MESA (the Multi-Ethnic Study of Atherosclerosis) for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
Funding (Rongcheng Bay demersal fish diversity study): Supported by funds from the National Natural Science Foundation of China (No. 42076100) and the Joint Funds of the National Natural Science Foundation of China (No. U2006214).
Funding (complex traits overview whose abstract is truncated at "catego-"): The National Basic Research Program of China (2006CB 101700); the Program for New Century Excellent Talents in University, Ministry of Education of China (NCET-05-0502); the Natural Science Foundation of Jiangsu Province (BK2006066).
Funding (outlier detection for QTL mapping study): Supported by the National Basic Research Program (973) of China (No. 2004CB117306) and the Hi-Tech Research and Development Program (863) of China (No. 2006AA10A102).
Funding (review on dissecting the genetic basis of complex traits): The National Basic Research Program of China (2006CB 101700); the National Natural Science Foundation of China (30370758); the Program for New Century Excellent Talents in University, Ministry of Education of China (NCET-05-0502); the Natural Science Foundation of Jiangsu Province of China to Xu Chenwu (BK2006066).
Abstract: Complex traits are features whose properties are determined by both genetic and environmental factors. Generally, complex traits include the classical quantitative traits with continuous distribution, the binary or categorical traits with discrete distribution controlled by polygenes, and other traits that cannot be measured exactly, such as behavior and psychology. Most human complex diseases and most economically important traits in plants and animals belong to this category. Understanding the molecular basis of complex traits plays a vital role in the genetic improvement of plants and animals. In this article, the concept and research background of complex traits are summarized, and the strategies, methods, and major progress made in dissecting the genetic basis of complex traits are reviewed. The challenges and possible developments in future research are also discussed.
Funding: Hendrech and Eiran Gotwert Fund; Wellcome, Grant/Award Number: 085906/Z/08/Z, 075491/Z/04 and 090532/Z/09/Z; Tel-Aviv University; Israeli Science Foundation, Grant/Award Number: 429/09, 961/15 and 1085/18; Binational Science Foundation, Grant/Award Number: 2015077; German Israeli Science Foundation, Grant/Award Number: I-63-410.20-2017; Israeli Cancer Research Fund; Cancer Research Counsel-UK Cancer Biology Research Center
Abstract: The Collaborative Cross (CC) mouse model is a next-generation mouse genetic reference population (GRP) designed for high-resolution quantitative trait loci (QTL) mapping of complex traits during health and disease. The CC lines were generated from reciprocal crosses of eight divergent mouse founder strains composed of five classical and three wild-derived strains. Complex traits are defined as traits controlled by variation within multiple genes and by gene/environment interactions. In this article, we introduce and present a variety of protocols and results from studies of the host response to infectious and chronic diseases, including type 2 diabetes and metabolic diseases, body composition, immune response, colorectal cancer, susceptibility to Aspergillus fumigatus, Klebsiella pneumoniae, Pseudomonas aeruginosa, sepsis, and mixed infections of Porphyromonas gingivalis and Fusobacterium nucleatum, which were conducted at our laboratory using the CC mouse population. These traits are observed at multiple levels of the body systems, including metabolism, body weight, immune profile, and susceptibility or resistance to the development and progress of infectious or chronic diseases. Herein, we present full protocols and step-by-step methods, implemented in our laboratory, for the phenotypic and genotypic characterization of the different CC lines and for mapping the genes underlying the host response to these infections and chronic diseases. The CC mouse model is a unique and powerful GRP for dissecting the host genetic architectures underlying complex traits, including chronic and infectious diseases.
Funding: funded by the Natural Science Foundation of China (31961133002, 31901553, and 31771879); the National Key Research and Development Program of China (2020YFE0202300); the Science and Technology Major Program of Hubei Province (2021ABA011); the Swedish Research Council for Environment, Agricultural Sciences, and Spatial Planning (2019-01600); the Key Science and Technology Project of the China National Tobacco Corporation (110202101040 JY-17); the Jilin Scientific and Technological Development Program (20190201290JC).
Abstract: Phenotypic plasticity is the ability of a given genotype to produce multiple phenotypes in response to changing environmental conditions. Understanding the genetic basis of phenotypic plasticity and establishing a predictive model is highly relevant to future agriculture under a changing climate. Here we report findings on the genetic basis of phenotypic plasticity for 23 complex traits using a diverse maize population planted at five sites with distinct environmental conditions. We found that latitude-related environmental factors were the main drivers of across-site variation in flowering time traits but not in plant architecture or yield traits. For the 23 traits, we detected 109 quantitative trait loci (QTLs), 29 for mean values, 66 for plasticity, and 14 for both parameters, and 80% of the QTLs interacted with latitude. The effects of several QTLs changed in magnitude or sign, driving variation in phenotypic plasticity. We experimentally validated one plastic gene, ZmTPS14.1, whose effect was likely mediated by the compensation effect of ZmSPL6 from a downstream pathway. By integrating genetic diversity, environmental variation, and their interaction into a joint model, we could provide site-specific predictions with accuracy increased by as much as 9.9%, 2.2%, and 2.6% for days to tassel, plant height, and ear weight, respectively. This study revealed a complex genetic architecture involving multiple alleles, pleiotropy, and genotype-by-environment interaction that underlies variation in the mean and plasticity of maize complex traits. It provides novel insights into the dynamic genetic architecture of agronomic traits in response to changing environments, paving a practical way toward precision agriculture.
Funding: XS was in receipt of a National Natural Science Foundation of China (NSFC) grant (No. 12171495), a Natural Science Foundation of Guangdong Province grant (No. 2114050001435), a National Key Research and Development Program grant (No. 2022YFF1202105), and Swedish Research Council (Vetenskapsrådet) grants (No. 2017-02543 & No. 2022-01309); supported by the Swedish Research Council grant (No. 2017-02543) XS. The Swedish National Infrastructure for Computing (SNIC) utilized was partially funded by the Swedish Research Council through grant agreement No. 2018-05973.
Abstract: Alternative splicing exists in most multi-exonic genes, and exploring these complex alternative splicing events and their resultant isoform expressions is essential. However, RNA sequencing results are conventionally summarized into gene-level expression counts, mainly because reads map ambiguously to multiple highly similar regions. Transcript-level quantification and interpretation are often overlooked, and biological interpretations are often deduced from combined transcript information at the gene level. Here, for the tissue with the most variable alternative splicing, the brain, we estimate isoform expressions in 1,191 samples collected by the Genotype-Tissue Expression (GTEx) Consortium using a powerful method that we previously developed. We perform genome-wide association scans on the isoform ratios per gene and identify isoform-ratio quantitative trait loci (irQTL), which could not be detected by studying gene-level expressions alone. By analyzing the genetic architecture of the irQTL, we show that isoform ratios regulate educational attainment via multiple tissues including the frontal cortex (BA9), cortex, cervical spinal cord, and hippocampus. These tissues are also associated with different neuro-related traits, including Alzheimer's or dementia, mood swings, sleep duration, alcohol intake, intelligence, anxiety or depression, etc. Mendelian randomization (MR) analysis revealed 1,139 pairs of isoforms and neuro-related traits with plausible causal relationships, showing much stronger causal effects than on general diseases measured in the UK Biobank (UKB). Our results highlight essential transcript-level biomarkers in the human brain for neuro-related complex traits and diseases, which could be missed by merely investigating overall gene expressions.
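As an illustration of the "isoform ratios per gene" used as the association phenotype, the following minimal Python sketch computes each isoform's share of its parent gene's total expression in one sample. The abstract does not spell out the exact definition, so the ratio formula, the function name `isoform_ratios`, and the toy identifiers are assumptions made for illustration only.

```python
def isoform_ratios(isoform_expression, gene_of):
    """Return each isoform's share of its gene's total expression.

    isoform_expression: dict mapping isoform ID -> expression in one sample
    gene_of: dict mapping isoform ID -> parent gene ID
    The ratio is None when the parent gene has zero total expression.
    (Assumed definition: isoform expression / sum over the gene's isoforms.)
    """
    # First pass: total expression per gene.
    gene_totals = {}
    for isoform, value in isoform_expression.items():
        gene = gene_of[isoform]
        gene_totals[gene] = gene_totals.get(gene, 0.0) + value

    # Second pass: divide each isoform by its gene total.
    ratios = {}
    for isoform, value in isoform_expression.items():
        total = gene_totals[gene_of[isoform]]
        ratios[isoform] = value / total if total > 0 else None
    return ratios


# Hypothetical toy data: two isoforms of geneA, one isoform of geneB.
expression = {"geneA.1": 30.0, "geneA.2": 10.0, "geneB.1": 5.0}
gene_of = {"geneA.1": "geneA", "geneA.2": "geneA", "geneB.1": "geneB"}
ratios = isoform_ratios(expression, gene_of)
```

Two samples with the same gene-level total can have very different ratios, which is why a ratio-based scan can pick up signals that a gene-level scan would miss.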
Abstract: Many rice-growing areas are affected by high concentrations of arsenic (As). Rice varieties that prevent As uptake and/or accumulation can mitigate As threats to human health. Genomic selection is known to facilitate rapid selection of superior genotypes for complex traits. We explored the predictive ability (PA) of genomic prediction with single-environment models, with or without accounting for trait-specific markers, multi-environment models, and multi-trait and multi-environment models, using the genotypic (1600K SNPs) and phenotypic (grain As content, grain yield, and days to flowering) data of the Bengal and Assam Aus Panel. Under the baseline single-environment model, PA of up to 0.707 and 0.654 was obtained for grain yield and grain As content, respectively; the three prediction methods (Bayesian Lasso, genomic best linear unbiased prediction, and reproducing kernel Hilbert spaces) were considered to perform similarly, and marker selection based on linkage disequilibrium allowed the number of SNPs to be reduced to 17K without a negative effect on the PA of genomic predictions. Single-environment models giving distinct weight to trait-specific markers in the genomic relationship matrix outperformed the baseline models by up to 32%. Multi-environment models, accounting for genotype × environment interactions, and multi-trait and multi-environment models outperformed the baseline models by up to 47% and 61%, respectively. Among the multi-trait and multi-environment models, the Bayesian multi-output regressor stacking function obtained the highest predictive ability (0.831 for grain As) with much higher computing-time efficiency. These findings pave the way for breeding for As tolerance in the progenies of biparental crosses involving members of the Bengal and Assam Aus Panel. Genomic prediction can also be applied to breeding for other complex traits under multiple environments.
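Abstract: Genome-wide association study (GWAS) is an effective method for locating variants in the genome that are significantly associated with traits. With more complete phenotypic records, the development of high-throughput genotyping technologies, and improvements in statistical methods, GWAS has been widely applied in human disease research and in animal and plant genetics. False positives are one of the major factors affecting the reliability of GWAS results. To control false positives, in addition to correcting P values, GWAS models have been continually improved, from the simplest analysis of variance (or the chi-square test for qualitative traits), to the general linear model (GLM) with fixed-effect covariates, and then to the mixed linear model (MLM) with random effects, thereby controlling false positives caused by a variety of confounding factors. Fitting the genetic effects of individuals as random effects defined by a genomic relationship matrix (GRM) is currently the common approach. Because parameter estimation in the MLM consumes substantial computing resources, researchers have repeatedly worked to optimize both model solving and GRM construction (the latter also improves computational efficiency), eventually reducing the time complexity of MLM-based computation step by step from O(MN³) to O(MN) and achieving a leap in both computing speed and statistical power. To address false positives caused by unbalanced case-control ratios in qualitative traits, researchers further applied corrections to the generalized linear mixed model (GLMM). This article gives a fairly comprehensive introduction to the basic principles and development of GWAS, with emphasis on the details of improving and optimizing the MLM in GWAS; it also lists applications of GWAS in agriculture, including research results in plants, animals, and microorganisms, as well as haplotype-based GWAS applications. Finally, future developments of GWAS are discussed from two perspectives: further improving the statistical power of GWAS and GWAS experimental design.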
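To make the GRM concrete, the sketch below builds one from 0/1/2 allele counts in plain Python. The abstract does not say which GRM construction the review covers; this example assumes the widely used VanRaden-style form G = ZZᵀ / (2 Σ pⱼ(1 − pⱼ)), where Z is the genotype matrix centered by twice the allele frequency. It also assumes the usual reading of N as the number of individuals and M as the number of markers. The function name is hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build an N x N genomic relationship matrix from allele counts.

    genotypes: list of N rows (individuals), each a list of M values in
    {0, 1, 2} counting copies of one allele at each marker.
    Assumed construction (VanRaden-style): G = Z Z^T / (2 * sum p(1-p)).
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency at each marker, estimated from the sample itself.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]

    # Scaling factor: expected heterozygosity summed over markers.
    denom = 2 * sum(p * (1 - p) for p in freqs)
    if denom == 0:
        raise ValueError("all markers are monomorphic; GRM is undefined")

    # Center each genotype by twice the allele frequency.
    centered = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]

    # Naive O(N^2 * M) cross-product of the centered rows.
    return [
        [sum(a * b for a, b in zip(ri, rk)) / denom for rk in centered]
        for ri in centered
    ]


# Hypothetical toy data: 3 individuals, 3 markers.
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
grm = genomic_relationship_matrix(toy)
```

This naive construction already costs O(N²M), which hints at why building and then repeatedly decomposing the GRM dominates MLM run time on large samples.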
Funding: supported by the National Basic Research Program of China (2011CB109306); the "111" Project of China (B06014); the CNTC (110200701023); the YNTC (08A05); Microsoft Research Asia
Abstract: The Quantitative Genetic Analysis Station (QGAStation) is a software package that has been developed to perform statistical analysis for complex traits. It consists of five domains for handling data from diallel crosses, regional trials, core germplasm collections, QTL mapping, and microarray experiments. The first domain contains genetic models for diallel cross analysis, in which genetic variance components and genetic-by-environment interactions can be estimated, and genetic effects can be predicted. The second domain evaluates the performance of varieties in regional trials by implementing a general statistical method that outperforms ANOVA in tackling the unbalanced data that arise frequently in trials across multiple locations and over a number of years. The third domain, using predicted genotypic values as a proxy, constructs core germplasm collections covering sufficient genetic diversity with lower redundancy. The fourth domain manages genotypic and phenotypic data for QTL mapping. Linkage maps can be constructed and genetic distances can be estimated; the statistical methods that have been implemented apply to both chiasmatic and achiasmatic organisms. Another part of this domain can filter systematic noise in phenotypic data. The fifth domain focuses on the cDNA expression data generated by microarray experiments. A two-step strategy has been implemented to detect differentially expressed genes and to estimate their effects. Except in the fourth domain, the major statistical methods used are mixed linear model approaches implemented in the C language. Computational efficiency is further boosted on computers equipped with graphics processing units (GPUs). A user-friendly graphical interface is provided for Microsoft Windows and Apple Mac operating systems. QGAStation is available at http://ibi.zju.edu.cn/software/qga/.
Funding: supported by the National Basic Research Program of China (2011CB109306); the National High Technology Research and Development Program of China (2009ZX08009-004B, 2011AA10A102); the CNTC (110200701023); the YNTC (08A05); the earmarked fund for Modern Agro-industry Technology Research System (CARS-18-05)
Abstract: A promising way to uncover the genetic architectures underlying complex traits may lie in the ability to recognize the genetic variants and expression transcripts that are responsible for the traits' inheritance. However, statistical methods capable of investigating the association between the inheritance of a quantitative trait and expression transcripts are still limited. In this study, we describe a two-step approach that we developed to evaluate the contribution of expression transcripts to the inheritance of a complex trait. First, a mixed linear model approach was applied to detect significant trait-associated differentially expressed transcripts. Then, conditional analysis was used to predict the contribution of the differentially expressed genes to a target trait. Diallel cross data from cotton were used to test the application of the approach. We propose that the detected differentially expressed transcripts with a strong impact on the target trait could be used as intermediates for screening lines to improve the traits in plant and animal breeding programs. The approach can benefit the discovery of the genetic mechanisms underlying complex traits.
Funding: supported by the National Basic Research Program of China (2011CB109306 and 2010CB126006); the National Special Program for Breeding New Transgenic Variety (2009ZX08009-004B); the CNTC (110200701023); and the YNTC (08A05)
Abstract: Most of the important agronomic traits in crops, such as yield and quality, are complex traits affected by multiple genes with gene × gene interaction as well as gene × environment interaction. Understanding the genetic architecture of complex traits is a long-term task for quantitative geneticists and plant breeders who wish to design efficient breeding programs. Conventionally, the genetic properties of traits can be revealed by partitioning the total variation into variation components caused by specific genetic effects. With recent advances in molecular genotyping and high-throughput technology, unraveling the genetic architecture of complex traits by analyzing quantitative trait loci (QTL) has become possible. The improvement of complex traits has also been achieved by pyramiding individual QTL. In this review, we describe some statistical methods for QTL mapping that can be used to analyze QTL × QTL interaction and QTL × environment interaction, and discuss their applications in crop breeding for complex traits.
Funding: supported by the National Natural Science Foundation of China (81171880); the National Basic Research Program of China (2011CB51001 to S. Huang); the GeNeSys Consortium (to O. Goldmann and E. Medina)
Abstract: It has long been assumed that most parts of a genome and most genetic variations or SNPs are non-functional with regard to reproductive fitness. However, the collective effects of SNPs have yet to be examined by experimental science. Here we developed a novel approach to examine the relationship between traits and the total amount of SNPs in panels of genetic reference populations. We identified the minor alleles (MAs) in each panel and the MA content (MAC) that each inbred strain carried for a set of SNPs with genotypes determined in these panels. MAC was nearly linearly linked to quantitative variations in numerous traits in model organisms, including life span, tumor susceptibility, learning and memory, sensitivity to alcohol and anti-psychotic drugs, and two correlated traits, poor reproductive fitness and strong immunity. These results suggest that the collective effects of SNPs are functional and do affect reproductive fitness.
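The MAC computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: strains are fully inbred (one allele per SNP), MAC is expressed as the fraction of informative SNPs at which a strain carries the panel's minor allele, and SNPs that are monomorphic, multi-allelic, or tied are skipped. The function names and the toy panel are hypothetical, not taken from the study.

```python
from collections import Counter


def minor_alleles(panel):
    """Return the minor allele at each SNP, or None if it is not informative.

    panel: dict mapping strain name -> list of alleles, one per SNP
    (inbred strains are assumed homozygous, so one allele per SNP).
    """
    n_snps = len(next(iter(panel.values())))
    minors = []
    for j in range(n_snps):
        counts = Counter(alleles[j] for alleles in panel.values())
        if len(counts) != 2:
            minors.append(None)  # monomorphic or multi-allelic: skip
            continue
        (_, major_n), (minor, minor_n) = counts.most_common(2)
        minors.append(None if major_n == minor_n else minor)  # skip ties
    return minors


def minor_allele_content(panel):
    """Fraction of informative SNPs at which each strain carries the minor allele."""
    minors = minor_alleles(panel)
    informative = [j for j, allele in enumerate(minors) if allele is not None]
    if not informative:
        raise ValueError("no informative SNPs in the panel")
    return {
        strain: sum(alleles[j] == minors[j] for j in informative) / len(informative)
        for strain, alleles in panel.items()
    }


# Hypothetical toy panel: 5 inbred strains genotyped at 4 SNPs.
panel = {
    "S1": ["A", "C", "G", "T"],
    "S2": ["A", "C", "G", "T"],
    "S3": ["A", "T", "G", "C"],
    "S4": ["G", "C", "A", "T"],
    "S5": ["A", "C", "G", "T"],
}
mac = minor_allele_content(panel)
```

In this toy panel, strains S3 and S4 each carry the minor allele at two of the four SNPs, while the other strains carry none. The per-strain MAC values would then be regressed against a trait to look for the near-linear relationship the abstract reports.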
Funding: supported by the National Basic Research Program of China (2011CB100106), the National Natural Science Foundation of China (30971846 and 31171187), and the Vital Project of Natural Science of Universities in Jiangsu Province (09KJA210002) to C. Xu; the National Natural Science Foundation of China (31100882) to Z. Tang; the National Natural Science Foundation of China (31000539) to J. Xiao
Abstract: Chromosome segment substitution lines have been created in several experimental models, including many plant and animal species, and are useful tools for the genetic analysis and mapping of complex traits. The traditional t-test is usually applied to identify a quantitative trait locus (QTL) that is contained within a chromosome segment and to estimate the QTL's effect. However, current methods cannot uncover the entire genetic structure of complex traits. For example, current methods cannot distinguish between main effects and epistatic effects. In this paper, a linear epistatic model was constructed to dissect complex traits. First, all the long substituted segments were divided into overlapping small bins, and each small bin was considered a unique independent variable. The genetic model for complex traits was then constructed. When considering all the possible main effects and epistatic effects, the dimensions of the linear model can become extremely high. Therefore, variable selection via stepwise regression (Bin-REG) was proposed for the epistatic QTL analysis in the present study. Furthermore, we tested the feasibility of using the LASSO (least absolute shrinkage and selection operator) algorithm to estimate epistatic effects, examined the fully Bayesian SSVS (stochastic search variable selection) approach, tested the empirical Bayes (E-BAYES) method, and evaluated the penalized likelihood (PENAL) method for mapping epistatic QTLs. Simulation studies suggested that all of the above methods, excluding the LASSO and PENAL approaches, performed satisfactorily. The Bin-REG method appears to outperform all other methods in terms of estimating positions and effects.
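The dimensionality problem mentioned above can be seen by building the design matrix explicitly. The Python sketch below expands a bin indicator matrix into main-effect columns plus all pairwise product (epistatic) columns. The 0/1 coding (1 if a line carries the donor segment covering that bin) and the function name `epistatic_design` are assumptions for illustration; the abstract does not give the paper's exact coding, and the variable-selection step itself (Bin-REG, LASSO, and so on) is not implemented here.

```python
from itertools import combinations


def epistatic_design(bin_matrix):
    """Expand bin indicators into main-effect and pairwise epistatic columns.

    bin_matrix: list of rows (one per substitution line), each a list of
    0/1 indicators, one per bin (assumed coding: 1 = donor segment present).
    Returns (column_names, design_rows).
    """
    n_bins = len(bin_matrix[0])
    pairs = list(combinations(range(n_bins), 2))

    # Main effects first, then one product column per unordered bin pair.
    names = [f"bin{j}" for j in range(n_bins)]
    names += [f"bin{j}xbin{k}" for j, k in pairs]

    rows = [
        list(row) + [row[j] * row[k] for j, k in pairs]
        for row in bin_matrix
    ]
    return names, rows


# Hypothetical toy data: 3 lines, 3 bins.
bins = [
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
]
names, design = epistatic_design(bins)
```

With p bins the expanded model has p + p(p − 1)/2 columns, so 100 bins already give 5,050 candidate effects, typically far more than the number of lines. That is the reason a selection or shrinkage step is needed before effects can be estimated.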