Abstract: Common variants explain little of the variance of most common diseases, prompting large-scale sequencing studies to understand the contribution of rare variants to these diseases. Imputation of rare variants from genome-wide genotyping arrays offers a cost-efficient strategy to achieve the sample sizes required for adequate statistical power. To estimate the performance of imputation of rare variants, we imputed 153 individuals, each of whom was genotyped on 3 different genotype arrays including 317k, 610k and 1 million single nucleotide polymorphisms (SNPs), against two different reference panels, HapMap2 and the 1000 Genomes pilot March 2010 release (1KGpilot), using IMPUTE version 2. We found that more than 94% and 84% of all SNPs yield acceptable accuracy (info > 0.4) in HapMap2 and 1KGpilot-based imputation, respectively. For rare variants (minor allele frequency (MAF) < 5%), the proportion of well-imputed SNPs increased as the MAF increased from 0.3% to 5% across all 3 genome-wide association study (GWAS) datasets. The proportion of well-imputed SNPs was 69%, 60% and 49% for SNPs with a MAF from 0.3% to 5% for 1M, 610k and 317k, respectively. None of the very rare variants (MAF < 0.3%) were well imputed. We conclude that the imputation accuracy of rare variants increases with higher density of genome-wide genotyping arrays when the size of the reference panel is small. Variants with lower MAF are more difficult to impute. These findings have important implications for the design and replication of large-scale sequencing studies.
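The "proportion of well-imputed SNPs" statistic in this abstract can be illustrated with a minimal sketch. It assumes each imputed SNP is summarized by a MAF and an IMPUTE2-style info score; the function name `well_imputed_fraction`, the bin edges, and the toy records are hypothetical illustrations, not code or data from the study.

```python
from typing import Iterable, Tuple

# "Acceptable accuracy" cutoff quoted in the abstract (info > 0.4).
INFO_THRESHOLD = 0.4


def well_imputed_fraction(snps: Iterable[Tuple[float, float]],
                          maf_low: float, maf_high: float) -> float:
    """Fraction of SNPs with maf_low <= MAF < maf_high whose info exceeds the cutoff.

    Each SNP is a (maf, info) pair. Returns NaN when the MAF bin is empty.
    """
    in_bin = [info for maf, info in snps if maf_low <= maf < maf_high]
    if not in_bin:
        return float("nan")
    return sum(info > INFO_THRESHOLD for info in in_bin) / len(in_bin)


# Hypothetical toy records: (MAF, info score).
toy = [(0.001, 0.10), (0.002, 0.35), (0.01, 0.55),
       (0.02, 0.30), (0.04, 0.90), (0.20, 0.95)]

very_rare = well_imputed_fraction(toy, 0.0, 0.003)   # MAF < 0.3%
rare = well_imputed_fraction(toy, 0.003, 0.05)       # 0.3% <= MAF < 5%
print(very_rare, rare)
```

Half-open bins keep every SNP in exactly one MAF class, so per-bin fractions can be compared across arrays without double counting.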
Abstract: Diabetes mellitus is a complicated disease characterized by a complex interplay of genetic, epigenetic, and environmental variables. It is one of the world's fastest-growing diseases, with 783 million adults expected to be affected by 2045. Devastating macrovascular consequences (cerebrovascular disease, cardiovascular disease, and peripheral vascular disease) and microvascular complications (such as retinopathy, nephropathy, and neuropathy) increase mortality, blindness, and kidney failure and reduce overall quality of life in individuals with diabetes. Clinical risk factors and glycemic management alone cannot predict the development of vascular problems; multiple genetic investigations have revealed a clear hereditary component to both diabetes and its related complications. In the twenty-first century, technological advancements (genome-wide association studies, next-generation sequencing, and exome sequencing) have led to the identification of genetic variants associated with diabetes; however, these variants can explain only a small proportion of the total heritability of the condition. In this review, we address some of the likely explanations for this "missing heritability" in diabetes, such as the significance of uncommon variants, gene-environment interactions, and epigenetics. The clinical value of current discoveries, the management of diabetes, and future research directions are also discussed.
Funding: Supported by the National Natural Science Foundation of China, Nos. 81971029 (to LS) and 82071216 (to BJ).
Abstract: SNCA, GBA, and VPS35 are three common genes associated with Parkinson's disease. Previous studies have shown that these three genes may be associated with Alzheimer's disease (AD). However, it is unclear whether these genes increase the risk of AD in Chinese populations. In this study, we used a targeted gene sequencing panel to screen all the exon regions and the nearby sequences of GBA, SNCA, and VPS35 in a cohort including 721 AD patients and 365 healthy controls from China. The results revealed that neither common variants nor rare variants of these three genes were associated with AD in a Chinese population. These findings suggest that mutations in GBA, SNCA, and VPS35 are not likely to play an important role in the genetic susceptibility to AD in Chinese populations. The study was approved by the Ethics Committee of Xiangya Hospital, Central South University, China on March 9, 2016 (approval No. 201603198).
Funding: National Institutes of Health (NIH) grant R01 GM134005 and National Science Foundation (NSF) grant DMS 1902903. Dr. Sheng Chih Jin's effort was supported by the Pathway to Independence Award (K99/R00) program, grants K99HL143036-01A1 and R00HL143036-02.
Abstract: Background: Whole-exome sequencing (WES) studies have identified multiple genes enriched for de novo mutations (DNMs) in congenital heart disease (CHD) probands. However, risk gene identification based on DNMs alone remains statistically challenging due to the heterogeneous etiology of CHD and the low mutation rate in each gene. Methods: In this manuscript, we introduce a hierarchical Bayesian framework for gene-level association testing that jointly analyzes de novo and rare transmitted variants. Through integrative modeling of multiple types of genetic variants, gene-level annotations, and reference data from large population cohorts, our method accurately characterizes the expected frequencies of both de novo and transmitted variants and shows improved statistical power compared to analyses based on DNMs only. Results: Applied to WES data of 2,645 CHD proband-parent trios, our method identified 15 significant genes, half of which are novel, leading to new insights into the genetic bases of CHD. Conclusion: These results showcase the power of integrative analysis of transmitted and de novo variants for disease gene discovery.
Abstract: Rare genetic variants are abundant in genomes but less tractable in genome-wide association studies. Here we exploit a strategy of rare variation mapping to discover a gene essential for tendril development in cucumber (Cucumis sativus L.). In a collection of >3000 lines, we discovered a unique tendril-less line that forms branches instead of tendrils and, therefore, loses its climbing ability. We hypothesized that this unusual phenotype was caused by a rare variation and subsequently identified the causative single nucleotide polymorphism. The affected gene TEN encodes a TCP transcription factor conserved within the cucurbits and is expressed specifically in tendrils, representing a new organ identity gene. The variation occurs within a protein motif unique to the cucurbits and impairs its function as a transcriptional activator. Analyses of transcriptomes from near-isogenic lines identified downstream genes required for the tendril's capability to sense and climb a support. This study provides an example of how to explore rare functional variants in plant genomes.
Funding: Supported by a grant from the Youth National Science Foundation of China (No. 31100908).
Abstract: Human height is a highly heritable trait in which multiple genes are involved. Recent genome-wide association studies (GWASs) have identified COL11A1 as an important susceptibility gene for human height. To determine whether variants of COL11A1 are associated with adult and child height, we analyzed splicing and coding single-nucleotide variants across COL11A1 through exome-targeted sequencing and two validation stages totaling 20,426 Chinese Han samples. A total of 105 variants were identified by exome-targeted sequencing, of which 30 SNPs were located in the coding region. The strongest association signal was chr1_103380393, with a P value of 4.8 × 10⁻⁷. Chr1_103380393 also showed nominal significance in the validation stage (P = 1.21 × 10⁻⁶). Combined analysis of 16,738 samples strengthened the original association of chr1_103380393 with adult height (P_combined = 3.1 × 10⁻⁸), with an increased height of 0.292 SD (standard deviation) per G allele (95% CI: 0.19-0.40). There was no evidence (P = 0.843) that chr1_103380393 altered child height in 3688 child samples. Only the group aged 12-15 years showed slight significance, with a P value of 0.0258. This study shows for the first time that genetic variants of COL11A1 contribute to adult height, but not to child height, in the Chinese Han population, which expands our knowledge of the genetic factors underlying height variation and the biological regulation of human height.
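As a rough consistency check on the effect size quoted in this abstract, the standard error can be recovered from the 95% confidence interval and converted to a z statistic and a two-sided P value. This is a minimal sketch that assumes a symmetric Wald-type interval and a normal approximation; the helper names `se_from_ci` and `two_sided_p` are illustrative, and because the published interval is rounded to two decimals, only order-of-magnitude agreement with the reported combined P value should be expected.

```python
import math


def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error implied by a symmetric 95% Wald confidence interval."""
    return (upper - lower) / (2.0 * z_crit)


def two_sided_p(z: float) -> float:
    """Two-sided P value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


beta = 0.292                 # per-allele effect in SD units, from the abstract
se = se_from_ci(0.19, 0.40)  # interval bounds from the abstract
z = beta / se
p = two_sided_p(z)
print(f"SE ~ {se:.4f}, z ~ {z:.2f}, p ~ {p:.1e}")
```

Using `math.erfc` rather than `1 - erf` avoids losing precision in the far tail, where P values of this size live.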
Funding: This work is supported by NIH R35GM127063 (HT) and NIH AG066206 (ZH).
Abstract: Background: Genome-wide association studies (GWAS) have been widely adopted in studies of human complex traits and diseases. Results: This review surveys areas of active research: quantifying and partitioning trait heritability, fine mapping functional variants and integrative analysis, genetic risk prediction of phenotypes, and the analysis of sequencing studies that have identified millions of rare variants. Current challenges and opportunities are highlighted. Conclusion: GWAS have fundamentally transformed the field of human complex trait genetics. Novel statistical and computational methods have expanded the scope of GWAS and have provided valuable insights into the genetic architecture underlying complex phenotypes.
Funding: Supported by the Guangdong Key Project in "Development of new tools for diagnosis and treatment of Autism" (2018B030335001 to Z. Sun) and "Early diagnosis and treatment of autism spectrum disorders" (202007030002 to Z. Sun); the National Natural Science Foundation of China (32070590 to Y. Wang); the National Natural Science Foundation of China (81730036 and 81525007 to K. Xia); the Science and Technology Major Project of Hunan Provincial Science and Technology Department (2018SK1030 to K. Xia); the National Natural Science Foundation of China (81801133 to J. Li); the Young Elite Scientist Sponsorship Program by CAST (2018QNRC001 to J. Li); the Innovation-Driven Project of Central South University (20180033040004 to J. Li); and the Natural Science Foundation of Hunan Province for Outstanding Young Scholars (2020JJ3059 to J. Li).
Abstract: Neurodevelopmental disorders (NDDs) are a set of complex disorders characterized by diverse and co-occurring clinical symptoms. The genetic contribution in patients with NDDs remains largely unknown. Here, we sequence 519 NDD-related genes in 3,195 Chinese probands with neurodevelopmental phenotypes and identify 2,522 putative functional mutations consisting of 137 de novo mutations (DNMs) in 86 genes and 2,385 rare inherited mutations (RIMs), with 22 X-linked hemizygotes in 13 genes, 2 homozygous mutations in 2 genes and 23 compound heterozygous mutations in 10 genes. Furthermore, the DNMs of 16,807 probands with NDDs are retrieved from public datasets and combined in an integrated analysis with the mutation data of our Chinese NDD probands, taking 3,582 in-house controls of Chinese origin as background. We prioritize 26 novel candidate genes. Notably, six of these genes (ITSN1, UBR3, CADM1, RYR3, FLNA, and PLXNA3) preferentially contribute to autism spectrum disorders (ASDs), as demonstrated by high co-expression and/or interaction with ASD genes confirmed via rescue experiments in a mouse model. Importantly, these genes are significantly differentially expressed in the ASD cortex and are involved in ASD-associated networks. Together, our study expands the genetic spectrum of Chinese NDDs, further facilitating both basic and translational research.