The coronavirus disease 2019 (COVID-19) pandemic has greatly damaged human society, but the origins and early transmission patterns of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pathogen remain unclear. Here, we reconstructed the transmission networks of SARS-CoV-2 during the first three and six months after its first report, based on ancestor-offspring relationships inferred from BANAL-52-referenced mutations. We explored the position (i.e., root, middle, or tip) of early detected samples in the evolutionary tree of SARS-CoV-2. In total, 6799 transmission chains and 1766 transmission networks were reconstructed, with chain lengths ranging from 1 to 9 nodes. The root node samples of the 1766 transmission networks were from 58 countries or regions and showed no common ancestor, indicating the occurrence of many independent or parallel transmissions of SARS-CoV-2 when first detected (i.e., all samples were located at the tip position of the evolutionary tree). No root node sample was found among the samples (n=31, all from the Chinese mainland) collected in the first 15 days from 24 December 2019. Results using six-month data or RaTG13-referenced mutation data were similar. The reconstruction method was verified using a simulation approach. Our results suggest that SARS-CoV-2 may have already been spreading independently worldwide before the outbreak of COVID-19 in Wuhan, China. Thus, a comprehensive global survey of human and animal samples is essential to explore the origins of SARS-CoV-2 and its natural reservoirs and hosts.
Cornelia de Lange syndrome (CdLS; OMIM: 122470) is characterized by distinctive facial features, growth retardation, hirsutism, and upper limb reduction defects. Craniofacial features manifest as synophrys, arched eyebrows, long thick eyelashes, a small upturned nose, small widely-spaced teeth, and microcephaly. The intelligence quotient (IQ) is usually below the normal level. Additional phenotypes are frequently found, such as cardiac septal defects, gastrointestinal dysfunction, hearing loss, myopia, and cryptorchidism or hypoplastic genitalia.
Primary biliary cholangitis (PBC) is an autoimmune disease involving dysregulation of a broad array of homeostatic and metabolic processes. Although a considerable number of single-nucleotide polymorphisms have been identified, a large fraction of risk factors remains enigmatic. Candidate genes with rare mutations that tend to confer more deleterious effects need to be identified. To help pinpoint cellular and developmental mechanisms beyond common noncoding variants, we combine whole-exome sequencing with integrative network analysis to investigate genes harboring de novo mutations. Prominent convergence is revealed on a disease-specific co-expression network comprising 55 genes associated with homeostasis and metabolism. The transcription factor gene MEF2D and the DNA repair gene PARP2 are highlighted as hub genes and found to be up- and down-regulated, respectively, in a peripheral blood data set. Enrichment analysis demonstrates that altered expression of MEF2D and PARP2 may trigger a series of molecular and cellular processes with pivotal roles in PBC pathophysiology. Our study identifies genes with de novo mutations in PBC and suggests that a subset of genes in homeostasis and metabolism tend to act in synergy by converging on a co-expression network, providing novel insights into the etiology of PBC and expanding the pool of molecular candidates for discovering clinically actionable biomarkers.
The mutation rate used in previous analyses of pig evolution and demographics was only a rough estimate and hence invited potential bias in inferring evolutionary history. Herein, we estimated the de novo mutation rate of pigs as 3.6 × 10⁻⁹ per base per generation using high-quality whole-genome sequencing data from nine individuals in a three-generation pedigree, with stringent filtering and validation. Using this mutation rate, we re-investigated the evolutionary history of pigs. The estimated divergence time of ~10 kiloyears ago (KYA) between European wild and domesticated pigs was consistent with the domestication time of European pigs based on archaeological evidence. However, other divergence events inferred here were not as ancient as previously described. Our estimates suggest that Sus speciation occurred ~1.36 million years ago (MYA); European wild pigs split from Asian wild pigs only ~219 KYA; and south and north Chinese wild pigs split ~25 KYA. Meanwhile, our results showed that the most recent divergence event between Chinese wild and domesticated pigs occurred in the Hetao Plain, northern China, approximately 20 KYA, supporting possibly independent domestication in northern China along the middle Yellow River. We also found that the maximum effective population size of pigs was ~6 times larger than previously estimated. An archaic migration into European pigs from other Sus species originating ~2 MYA was detected during the western colonization of pigs, which may affect the accuracy of previous demographic inference. Our de novo mutation rate estimate and its consequences for demographic history inference provide a new view of the evolutionary history of pigs.
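As a reading aid for the abstract above, a per-base, per-generation de novo mutation rate is conventionally computed as the number of de novo mutations divided by twice the number of callable sites times the number of offspring (each offspring inherits two haploid genomes). The sketch below illustrates only that arithmetic. The function name and all input counts are hypothetical, chosen so the result matches the reported order of magnitude; they are not taken from the paper, whose filtering and validation pipeline is not described here.

```python
def de_novo_mutation_rate(n_dnms: int, callable_sites: int, n_offspring: int) -> float:
    """Per-base, per-generation rate: DNMs / (2 * callable sites * offspring).

    The factor 2 accounts for the two haploid genomes each offspring inherits.
    """
    if n_dnms < 0 or callable_sites <= 0 or n_offspring <= 0:
        raise ValueError("counts must be non-negative and denominators positive")
    return n_dnms / (2 * callable_sites * n_offspring)


# Hypothetical inputs (NOT from the paper): 36 validated DNMs observed across
# 2 offspring, with 2.5 billion callable sites per haploid genome.
rate = de_novo_mutation_rate(n_dnms=36, callable_sites=2_500_000_000, n_offspring=2)
print(f"{rate:.2e} per base per generation")
```

Only the formula is the point here; a real estimate would also need to correct for false-negative and false-positive calls.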
Congenital heart disease (CHD) is observed in up to 1% of live births and is one of the leading causes of mortality from birth defects. While hundreds of genes have been implicated in the genetic etiology of CHD, their role in CHD pathogenesis is still poorly understood. This is largely a reflection of the sporadic nature of CHD, as well as its variable expressivity and incomplete penetrance. We reviewed the monogenic causes and evidence for oligogenic etiology of CHD, as well as the role of de novo mutations, common variants, and genetic modifiers. For further mechanistic insight, we leveraged single-cell data across species to investigate the cellular expression characteristics of genes implicated in CHD in developing human and mouse embryonic hearts. Understanding the genetic etiology of CHD may enable the application of precision medicine and prenatal diagnosis, thereby facilitating early intervention to improve outcomes for patients with CHD.
Vogt–Koyanagi–Harada (VKH) disease is a leading cause of blindness in young and middle-aged people. However, the etiology of VKH disease remains unclear. Here, we performed the first trio-based whole-exome sequencing study, which enrolled 25 VKH patients and 50 controls, followed by a study of 2081 VKH patients from a Han Chinese population to uncover detrimental mutations. A total of 15 de novo mutations in VKH patients were identified, one of the most important being the membrane palmitoylated protein 2 (MPP2) p.K315N (MPP2-N315) mutation. The MPP2-N315 mutation was highly deleterious according to bioinformatic predictions. Additionally, this mutation appears rare, being absent from the 1000 Genomes Project and the Genome Aggregation Database, and the mutated site is highly conserved in 10 species, including humans and mice. Subsequent studies showed that pathological phenotypes and retinal vascular leakage were aggravated in MPP2-N315 knock-in mice or MPP2-N315 adeno-associated virus-treated mice with experimental autoimmune uveitis (EAU). In vitro, we used clustered regularly interspaced short palindromic repeats (CRISPR-Cas9) gene editing technology to delete intrinsic MPP2 before overexpressing wild-type MPP2 or MPP2-N315. Levels of cytokines, such as IL-1β, IL-17E, and vascular endothelial growth factor A, were increased, and barrier function was disrupted in the MPP2-N315 mutant ARPE19 cells. Mechanistically, MPP2-N315 bound ANXA2 directly and more strongly than MPP2-K315, as shown by LC-MS/MS and Co-IP, and this resulted in activation of the ERK3/IL-17E pathway. Overall, our results demonstrated that the MPP2-K315N mutation may increase susceptibility to VKH disease.
INTRODUCTION Hypoparathyroidism, sensorineural deafness, and renal dysplasia (HDR) syndrome, also called Barakat syndrome, is an autosomal dominant genetic disease caused by haploinsufficiency of the GATA-binding protein 3 (GATA3) gene located on chromosome 10p15.
De novo variants (DNVs) are one of the most significant contributors to severe early-onset genetic disorders such as autism spectrum disorder, intellectual disability, and other developmental and neuropsychiatric (DNP) disorders. Presently, a plethora of DNVs have been identified using next-generation sequencing, and many efforts have been made to understand their impact at the gene level. However, there has been little exploration of the effects at the isoform level. The brain contains a high level of alternative splicing and regulation, and exhibits a more divergent splicing program than other tissues. Therefore, it is crucial to explore variants at the transcriptional regulation level to better interpret the mechanisms underlying DNP disorders. To facilitate better usage and improve the isoform-level interpretation of variants, we developed the NeuroPsychiatric Mutation Knowledge Base (PsyMuKB). It contains a comprehensive, carefully curated list of DNVs with transcriptional and translational annotations to enable identification of isoform-specific mutations. PsyMuKB allows a flexible search of genes or variants and provides both table-based descriptions and associated visualizations, such as expression, transcript genomic structures, protein interactions, and the mutation sites mapped on the protein structures. It also provides an easy-to-use web interface, allowing users to rapidly visualize the locations and characteristics of mutations and the expression patterns of the impacted genes and isoforms. PsyMuKB thus constitutes a valuable resource for identifying tissue-specific DNVs for further functional studies of related disorders. PsyMuKB is freely accessible at http://psymukb.net.
Background: Whole-exome sequencing (WES) studies have identified multiple genes enriched for de novo mutations (DNMs) in congenital heart disease (CHD) probands. However, risk gene identification based on DNMs alone remains statistically challenging due to the heterogeneous etiology of CHD and the low mutation rate in each gene. Methods: In this manuscript, we introduce a hierarchical Bayesian framework for gene-level association testing which jointly analyzes de novo and rare transmitted variants. Through integrative modeling of multiple types of genetic variants, gene-level annotations, and reference data from large population cohorts, our method accurately characterizes the expected frequencies of both de novo and transmitted variants and shows improved statistical power compared to analyses based on DNMs only. Results: Applied to WES data of 2,645 CHD proband-parent trios, our method identified 15 significant genes, half of which are novel, leading to new insights into the genetic bases of CHD. Conclusion: These results showcase the power of integrative analysis of transmitted and de novo variants for disease gene discovery.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder with considerable clinical and genetic heterogeneity. In this study, we identified all classes of genomic variants from a whole-genome sequencing (WGS) dataset of 32 Chinese trios with ASD, including de novo mutations, inherited variants, copy number variants (CNVs) and genomic structural variants. A higher mutation rate (Poisson test, P<2.2×10) in exonic (1.37×10) and 3'-UTR regions (1.42×10) was revealed in comparison with that of the whole genome (1.05×10). Using an integrated model, we identified 87 potential risk genes (P<0.01) from 4832 genes harboring various rare deleterious variants, including CHD8 and NRXN2, implying that the disorder may follow a multiple-hit model. In particular, frequent rare inherited mutations of several microcephaly-associated genes (ASPM, WDR62, and ZNF335) were found in ASD. In chromosomal structure analyses, we found four de novo CNVs and one de novo chromosomal rearrangement event, including a de novo duplication of the UBE3A-containing region at 15q11.2-q13.1, which causes Angelman syndrome and microcephaly, and a disrupted TNR due to a de novo chromosomal translocation t(1;5)(q25.1;q33.2). Taken together, our results suggest that abnormalities of centrosomal function and chromatin remodeling of the microcephaly-associated genes may be implicated in the pathogenesis of ASD. Adoption of WGS as a new yet efficient technique to illustrate the full genetic spectrum in complex disorders, such as ASD, could provide novel insights into pathogenesis, diagnosis and treatment.
The capacity of RNA viruses to adapt to new hosts and rapidly escape the host immune system is largely attributable to de novo genetic diversity that emerges through mutations in RNA. Although the molecular spectrum of de novo mutations (the relative rates at which various base substitutions occur) is widely recognized as informative for understanding the evolution of a viral genome, little attention has been paid to the possibility of using molecular spectra to infer the host origins of a virus. Here, we characterize the molecular spectrum of de novo mutations for SARS-CoV-2 from transcriptomic data obtained from virus-infected cell lines, enabled by the use of sporadic junctions formed during discontinuous transcription as molecular barcodes. We find that de novo mutations are generated in a replication-independent manner, typically on the genomic strand, and are highly dependent on mutagenic mechanisms specific to the host cellular environment. De novo mutations then strongly influence the types of base substitutions accumulated during SARS-CoV-2 evolution, in an asymmetric manner favoring specific mutation types. Consequently, similarities between the mutation spectra of SARS-CoV-2 and the bat coronavirus RaTG13, which have accumulated since their divergence, strongly suggest that SARS-CoV-2 evolved in a host cellular environment highly similar to that of bats before its zoonotic transfer into humans. Collectively, our findings provide data-driven support for the natural origin of SARS-CoV-2.
Neurodevelopmental disorders (NDDs) are a set of complex disorders characterized by diverse and co-occurring clinical symptoms. The genetic contribution in patients with NDDs remains largely unknown. Here, we sequence 519 NDD-related genes in 3,195 Chinese probands with neurodevelopmental phenotypes and identify 2,522 putative functional mutations consisting of 137 de novo mutations (DNMs) in 86 genes and 2,385 rare inherited mutations (RIMs), with 22 X-linked hemizygotes in 13 genes, 2 homozygous mutations in 2 genes and 23 compound heterozygous mutations in 10 genes. Furthermore, the DNMs of 16,807 probands with NDDs are retrieved from public datasets and combined in an integrated analysis with the mutation data of our Chinese NDD probands, taking 3,582 in-house controls of Chinese origin as background. We prioritize 26 novel candidate genes. Notably, six of these genes (ITSN1, UBR3, CADM1, RYR3, FLNA, and PLXNA3) preferentially contribute to autism spectrum disorders (ASDs), as demonstrated by high co-expression and/or interaction with ASD genes confirmed via rescue experiments in a mouse model. Importantly, these genes are significantly differentially expressed in the ASD cortex and are involved in ASD-associated networks. Together, our study expands the genetic spectrum of Chinese NDDs, further facilitating both basic and translational research.
The major histocompatibility complex (MHC) is closely associated with numerous diseases, but its high degree of polymorphism complicates the discovery of disease-associated variants. In principle, recombination and de novo mutations are two critical factors responsible for MHC polymorphisms. However, direct evidence for this hypothesis is lacking. Here, we report the generation of fine-scale MHC recombination and de novo mutation maps of ~5 Mb by deep sequencing (>100×) of the MHC genome for 17 MHC recombination and 30 non-recombination Han Chinese families (a total of 190 individuals). Recombination hotspots and Han-specific breakpoints are located in close proximity at haplotype block boundaries. The average MHC de novo mutation rate is higher than the genome-wide de novo mutation rate, particularly in MHC recombinant individuals. Notably, mutation- and recombination-generated polymorphisms are located within and outside linkage disequilibrium regions of the MHC, respectively, and evolution of the MHC locus was mainly controlled by positive selection. These findings provide insights into the evolutionary causes of MHC diversity and may facilitate the identification of disease-associated genetic variants.
Over the past two years, scientists throughout the world have completed more than 6 million SARS-CoV-2 genome sequences. Today, the number of SARS-CoV-2 genomes exceeds the total number of all other viral genomes. These genomes are a record of the evolution of SARS-CoV-2 in the human host, and provide information on the emergence of mutations. In this study, analysis of these sequenced genomes identified 296,728 de novo mutations (DNMs), and found that six types of base substitutions reached saturation in the sequenced genome population. Based on this analysis, a "mutation blacklist" of SARS-CoV-2 was compiled. The loci on the "mutation blacklist" are highly conserved, and these mutations likely have detrimental effects on virus survival, replication, and transmission. This information is valuable for SARS-CoV-2 research on gene function, vaccine design, and drug development. Through association analysis of DNMs and viral transmission rates, we identified 185 DNMs that positively correlated with the SARS-CoV-2 transmission rate, and these DNMs were classified as the "mutation whitelist" of SARS-CoV-2. The mutations on the "mutation whitelist" are beneficial for SARS-CoV-2 transmission and could therefore be used to evaluate the transmissibility of new variants. The occurrence of mutations and the evolution of viruses are dynamic processes. To more effectively monitor the mutations and variants of SARS-CoV-2, we built a SARS-CoV-2 mutation and variant monitoring and pre-warning system (MVMPS), which can monitor the occurrence and development of mutations and variants of SARS-CoV-2, as well as provide pre-warning for the prevention and control of SARS-CoV-2 (https://www.omicx.cn/). Additionally, this system could be used in real time to update the "mutation whitelist" and "mutation blacklist" of SARS-CoV-2.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic resulted in significant societal costs. Hence, an in-depth understanding of SARS-CoV-2 mutation and evolution will help determine the direction of the COVID-19 pandemic. In this study, we identified 296,728 de novo mutations in more than 2,800,000 high-quality SARS-CoV-2 genomes. All possible factors affecting the mutation frequency of SARS-CoV-2 in human hosts were analyzed, including zinc finger antiviral proteins, sequence context, amino acid change, and translation efficiency. As a result, we proposed that when adenine (A) and thymine (T) bases are in the context of an AM (M stands for adenine or cytosine) or TA motif, the A or T base has a lower mutation frequency. Furthermore, we hypothesized that translation efficiency can affect the mutation frequency of the third position of the codon through selection, which explains why SARS-CoV-2 prefers AT3 codon usage. In addition, we found a host-specific asymmetric dinucleotide mutation frequency in the SARS-CoV-2 genome, which provides a new basis for determining the origin of SARS-CoV-2. Finally, we summarize all possible factors affecting mutation frequency and provide insights into the mutation characteristics and evolutionary trends of SARS-CoV-2.
Schizophrenia is a common disorder with high heritability, but its genetic architecture is still elusive. We performed whole-genome sequencing (WGS) analysis of 8 families with monozygotic (MZ) twin pairs discordant for schizophrenia to assess the potential association of de novo mutations (DNMs) or inherited variants with susceptibility to schizophrenia. Eight non-synonymous DNMs (including one splice-site mutation) were identified and shared by twins; these were either located in previously reported schizophrenia risk genes (p.V24689I mutation in TTN, p.S2506T mutation in GCN1L1, IVS3+1G>T in DOCK1) or had a benign to damaging effect according to in silico prediction analysis. By searching for inherited rare damaging or loss-of-function (LOF) variants and common susceptibility alleles in three classes of schizophrenia candidate genes, we were able to identify genetic alterations in several schizophrenia risk genes, including GAD1, PLXNA2, RELN and FEZ1. Four inherited copy number variations (CNVs; including a large deletion at 16p13.11) implicated in schizophrenia were identified, one in each of four families. Most families carried both missense DNMs and inherited risk variants, which might suggest that DNMs, inherited rare damaging variants and common risk alleles together confer susceptibility to schizophrenia. Our results support the view that schizophrenia is caused by a combination of multiple genetic factors, with each DNM/variant showing a relatively small effect size.
Schizophrenia (SCZ) is a complex and heterogeneous mental disorder that affects about 1% of the global population. In recent years, considerable progress has been made in genetic studies of SCZ. A number of common variants with small effects and rare variants with relatively larger effects have been identified. These variants include risk loci identified by genome-wide association studies, rare copy-number variants identified by comparative genomic analyses, and de novo mutations identified by high-throughput DNA sequencing. Collectively, they contribute to the heterogeneity of the disease. In this review, we update recent discoveries in the field of SCZ genetics, and outline the perspectives of future directions.
Funding (SARS-CoV-2 transmission networks study): Supported by the Ministry of Science and Technology of the People's Republic of China (2021YFC0863400) and the Institute of Zoology, Chinese Academy of Sciences (E0517111, E122G611).
Funding (Cornelia de Lange syndrome study): This work was supported by grants from the Natural Science Foundation of Zhejiang Province (No. Y2090108), the Qianjiang Talent Program (No. 2010R10063) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Funding (primary biliary cholangitis study): Supported in part by grants from the National Natural Science Foundation of China (81870397 to X.D.L.; 81620108002, 81771732, 81830016 to X.M; and 81570469 to R.Q.T.), by grants from the Jiangsu provincial research fund (BE2017713 to X.D.L. and BL2018657 to Y.T.), and by a grant from the National Key R&D Program of China (2016YFC0900400).
Funding (pig de novo mutation rate study): This work was financially supported by the Innovative Research Team of the Ministry of Education of China (Grant No. IRT1136), the National Natural Science Foundation of China (Grant No. 31672383), and the National Swine Industry and Technology System of China (Grant No. nycytx-009).
Funding: We thank the families for participation in this study, and we thank Novogene Technology Co., Ltd., for the WES sequencing and analysis. This work was supported by the National Natural Science Foundation Project of China (82070951, 82271078), the National Natural Science Foundation Key Program (81930023), the Innovative Research Group Project of Chongqing Education Commission (CXQT19015), the Innovation Supporting Plan of Overseas Study of Chongqing (cx2018010), the National Key Clinical Specialties Construction Program of China, the Chongqing Branch of the National Clinical Research Center for Ocular Diseases, the Chongqing Key Laboratory of Ophthalmology (CSTC, 2008CA5003), and the Program for Youth Innovation in Future Medicine, Chongqing Medical University (w0047).
Abstract: Vogt–Koyanagi–Harada (VKH) disease is a leading cause of blindness in young and middle-aged people. However, the etiology of VKH disease remains unclear. Here, we performed the first trio-based whole-exome sequencing study, which enrolled 25 VKH patients and 50 controls, followed by a study of 2081 VKH patients from a Han Chinese population to uncover detrimental mutations. A total of 15 de novo mutations in VKH patients were identified, with one of the most important being the membrane palmitoylated protein 2 (MPP2) p.K315N (MPP2-N315) mutation. The MPP2-N315 mutation was highly deleterious according to bioinformatic predictions. Additionally, this mutation appears rare, being absent from the 1000 Genomes Project and the Genome Aggregation Database, and the affected site is highly conserved in 10 species, including humans and mice. Subsequent studies showed that pathological phenotypes and retinal vascular leakage were aggravated in MPP2-N315 mutation knock-in or MPP2-N315 adeno-associated virus-treated mice with experimental autoimmune uveitis (EAU). In vitro, we used clustered regularly interspaced short palindromic repeats (CRISPR‒Cas9) gene editing technology to delete intrinsic MPP2 before overexpressing wild-type MPP2 or MPP2-N315. Levels of cytokines, such as IL-1β, IL-17E, and vascular endothelial growth factor A, were increased, and barrier function was destroyed in the MPP2-N315 mutant ARPE19 cells. Mechanistically, the MPP2-N315 mutant had a stronger ability to directly bind to ANXA2 than MPP2-K315, as shown by LC‒MS/MS and Co-IP, and resulted in activation of the ERK3/IL-17E pathway. Overall, our results demonstrated that the MPP2-K315N mutation may increase susceptibility to VKH disease.
Abstract: INTRODUCTION Hypoparathyroidism, sensorineural deafness, and renal dysplasia (HDR) syndrome, also called Barakat syndrome, is an autosomal dominant genetic disease caused by haploinsufficiency of the GATA-binding protein 3 (GATA3) gene located on chromosome 10p15.
Funding: Supported by grants from the National Key R&D Program of China (Grant No. 2017YFC0909200), the National Natural Science Foundation of China (Grant Nos. 81671328 and 61802057), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (Grant No. 1610000043), the Innovation Research Plan supported by Shanghai Municipal Education Commission (Grant No. ZXWF082101), the Science and Technology Development Plan of Jilin Province (Grant Nos. 20180414006GH and 20180520028JH), and the Fundamental Research Funds for the Central Universities.
Abstract: De novo variants (DNVs) are one of the most significant contributors to severe early-onset genetic disorders such as autism spectrum disorder, intellectual disability, and other developmental and neuropsychiatric (DNP) disorders. Presently, a plethora of DNVs have been identified using next-generation sequencing, and many efforts have been made to understand their impact at the gene level. However, there has been little exploration of the effects at the isoform level. The brain contains a high level of alternative splicing and regulation, and exhibits a more divergent splicing program than other tissues. Therefore, it is crucial to explore variants at the transcriptional regulation level to better interpret the mechanisms underlying DNP disorders. To facilitate better use and improve the isoform-level interpretation of variants, we developed the NeuroPsychiatric Mutation Knowledge Base (PsyMuKB). It contains a comprehensive, carefully curated list of DNVs with transcriptional and translational annotations to enable identification of isoform-specific mutations. PsyMuKB allows a flexible search of genes or variants and provides both table-based descriptions and associated visualizations, such as expression, transcript genomic structures, protein interactions, and the mutation sites mapped on the protein structures. It also provides an easy-to-use web interface, allowing users to rapidly visualize the locations and characteristics of mutations and the expression patterns of the impacted genes and isoforms. PsyMuKB thus constitutes a valuable resource for identifying tissue-specific DNVs for further functional studies of related disorders. PsyMuKB is freely accessible at http://psymukb.net.
Funding: Supported by the National Institutes of Health (NIH) grant R01 GM134005 and the National Science Foundation (NSF) grant DMS 1902903. Dr. Sheng Chih Jin's effort was supported by the Pathway to Independence Award (K99/R00) program, grants K99HL143036-01A1 and R00HL143036-02.
Abstract: Background: Whole-exome sequencing (WES) studies have identified multiple genes enriched for de novo mutations (DNMs) in congenital heart disease (CHD) probands. However, risk gene identification based on DNMs alone remains statistically challenging due to the heterogeneous etiology of CHD and the low mutation rate in each gene. Methods: In this manuscript, we introduce a hierarchical Bayesian framework for gene-level association testing that jointly analyzes de novo and rare transmitted variants. Through integrative modeling of multiple types of genetic variants, gene-level annotations, and reference data from large population cohorts, our method accurately characterizes the expected frequencies of both de novo and transmitted variants and shows improved statistical power compared to analyses based on DNMs only. Results: Applied to WES data of 2,645 CHD proband-parent trios, our method identified 15 significant genes, half of which are novel, leading to new insights into the genetic bases of CHD. Conclusion: These results showcase the power of integrative analysis of transmitted and de novo variants for disease gene discovery.
Funding: Supported by grants from the Major State Basic Research Development Program of China (2012CB517902 and 2012CB517904), the National Key Technology Research and Development Program of China (2012BAI03B00), the Special Research Program of National Health and Family Planning Commission of China (201302002), the International S&T Cooperation Program of China (2011DFA30670), and the National Natural Science Foundation of China (31571357/31771404); supported in part by research funding from AstraZeneca Innovation Center China and Wenzhou Medical University.
Abstract: Autism spectrum disorder (ASD) is a neurodevelopmental disorder with considerable clinical and genetic heterogeneity. In this study, we identified all classes of genomic variants from a whole-genome sequencing (WGS) dataset of 32 Chinese trios with ASD, including de novo mutations, inherited variants, copy number variants (CNVs) and genomic structural variants. A higher mutation rate (Poisson test, P<2.2×10) in exonic (1.37×10) and 3’-UTR regions (1.42×10) was revealed in comparison with that of the whole genome (1.05×10). Using an integrated model, we identified 87 potential risk genes (P<0.01) from 4832 genes harboring various rare deleterious variants, including CHD8 and NRXN2, implying that the disorder may follow a multiple-hit model. In particular, frequent rare inherited mutations of several microcephaly-associated genes (ASPM, WDR62, and ZNF335) were found in ASD. In chromosomal structure analyses, we found four de novo CNVs and one de novo chromosomal rearrangement event, including a de novo duplication of the UBE3A-containing region at 15q11.2-q13.1, which causes Angelman syndrome and microcephaly, and a disrupted TNR due to the de novo chromosomal translocation t(1;5)(q25.1;q33.2). Taken together, our results suggest that abnormalities of centrosomal function and chromatin remodeling of the microcephaly-associated genes may be implicated in the pathogenesis of ASD. Adoption of WGS as a new yet efficient technique to illustrate the full genetic spectrum in complex disorders, such as ASD, could provide novel insights into pathogenesis, diagnosis and treatment.
Funding: Supported by grants from the National Natural Science Foundation of China (31922014).
Abstract: The capacity of RNA viruses to adapt to new hosts and rapidly escape the host immune system is largely attributable to de novo genetic diversity that emerges through mutations in RNA. Although the molecular spectrum of de novo mutations (the relative rates at which various base substitutions occur) is widely recognized as informative toward understanding the evolution of a viral genome, little attention has been paid to the possibility of using molecular spectra to infer the host origins of a virus. Here, we characterize the molecular spectrum of de novo mutations for SARS-CoV-2 from transcriptomic data obtained from virus-infected cell lines, enabled by the use of sporadic junctions formed during discontinuous transcription as molecular barcodes. We find that de novo mutations are generated in a replication-independent manner, typically on the genomic strand, and are highly dependent on mutagenic mechanisms specific to the host cellular environment. De novo mutations will then strongly influence the types of base substitutions accumulated during SARS-CoV-2 evolution, in an asymmetric manner favoring specific mutation types. Consequently, similarities between the mutation spectra of SARS-CoV-2 and the bat coronavirus RaTG13, which have accumulated since their divergence, strongly suggest that SARS-CoV-2 evolved in a host cellular environment highly similar to that of bats before its zoonotic transfer into humans. Collectively, our findings provide data-driven support for the natural origin of SARS-CoV-2.
Funding: Supported by the Guangdong Key Project in “Development of new tools for diagnosis and treatment of Autism” (2018B030335001 to Z. Sun) and “Early diagnosis and treatment of autism spectrum disorders” (202007030002 to Z. Sun), the National Natural Science Foundation of China (32070590 to Y. Wang), the National Natural Science Foundation of China (81730036 and 81525007 to K. Xia), the Science and Technology Major Project of Hunan Provincial Science and Technology Department (2018SK1030 to K. Xia), the National Natural Science Foundation of China (81801133 to J. Li), the Young Elite Scientist Sponsorship Program by CAST (2018QNRC001 to J. Li), the Innovation-Driven Project of Central South University (20180033040004 to J. Li), and the Natural Science Foundation of Hunan Province for outstanding Young Scholars (2020JJ3059 to J. Li).
Abstract: Neurodevelopmental disorders (NDDs) are a set of complex disorders characterized by diverse and co-occurring clinical symptoms. The genetic contribution in patients with NDDs remains largely unknown. Here, we sequence 519 NDD-related genes in 3,195 Chinese probands with neurodevelopmental phenotypes and identify 2,522 putative functional mutations consisting of 137 de novo mutations (DNMs) in 86 genes and 2,385 rare inherited mutations (RIMs) with 22 X-linked hemizygotes in 13 genes, 2 homozygous mutations in 2 genes and 23 compound heterozygous mutations in 10 genes. Furthermore, the DNMs of 16,807 probands with NDDs are retrieved from public datasets and combined in an integrated analysis with the mutation data of our Chinese NDD probands by taking 3,582 in-house controls of Chinese origin as background. We prioritize 26 novel candidate genes. Notably, six of these genes (ITSN1, UBR3, CADM1, RYR3, FLNA, and PLXNA3) preferentially contribute to autism spectrum disorders (ASDs), as demonstrated by high co-expression and/or interaction with ASD genes confirmed via rescue experiments in a mouse model. Importantly, these genes are significantly differentially expressed in the ASD cortex and involved in ASD-associated networks. Together, our study expands the genetic spectrum of Chinese NDDs, further facilitating both basic and translational research.
Funding: Supported by grants from the National Key Basic Research Development Program of China (grant Nos. 2009CB522401 and 2003CB515509, and AWS14C014).
Abstract: The major histocompatibility complex (MHC) is closely associated with numerous diseases, but its high degree of polymorphism complicates the discovery of disease-associated variants. In principle, recombination and de novo mutations are two critical factors responsible for MHC polymorphisms. However, direct evidence for this hypothesis is lacking. Here, we report the generation of fine-scale MHC recombination and de novo mutation maps of ~5 Mb by deep sequencing (>100×) of the MHC genome for 17 MHC recombination and 30 non-recombination Han Chinese families (a total of 190 individuals). Recombination hotspots and Han-specific breakpoints are located in close proximity at haplotype block boundaries. The average MHC de novo mutation rate is higher than the genome-wide de novo mutation rate, particularly in MHC recombinant individuals. Notably, mutation-generated and recombination-generated polymorphisms are located within and outside linkage disequilibrium regions of the MHC, respectively, and evolution of the MHC locus was mainly controlled by positive selection. These findings provide insights into the evolutionary causes of MHC diversity and may facilitate the identification of disease-associated genetic variants.
Funding: This study was supported by funding from the Foundation of the Committee on Science and Technology of Tianjin (19YFZCSN00080), the State Key Research and Development Plan (2019YFC1605004), and the National Key Programs for Infectious Diseases of China (2017ZX10303405-001).
Abstract: Over the past two years, scientists throughout the world have completed more than 6 million SARS-CoV-2 genome sequences. Today, the number of SARS-CoV-2 genomes exceeds the total number of all other viral genomes. These genomes are a record of the evolution of SARS-CoV-2 in the human host, and provide information on the emergence of mutations. In this study, analysis of these sequenced genomes identified 296,728 de novo mutations (DNMs), and found that six types of base substitutions reached saturation in the sequenced genome population. Based on this analysis, a “mutation blacklist” of SARS-CoV-2 was compiled. The loci on the “mutation blacklist” are highly conserved, and these mutations likely have detrimental effects on virus survival, replication, and transmission. This information is valuable for SARS-CoV-2 research on gene function, vaccine design, and drug development. Through association analysis of DNMs and viral transmission rates, we identified 185 DNMs that positively correlated with the SARS-CoV-2 transmission rate, and these DNMs were classified as the “mutation whitelist” of SARS-CoV-2. The mutations on the “mutation whitelist” are beneficial for SARS-CoV-2 transmission and could therefore be used to evaluate the transmissibility of new variants. The occurrence of mutations and the evolution of viruses are dynamic processes. To more effectively monitor the mutations and variants of SARS-CoV-2, we built a SARS-CoV-2 mutation and variant monitoring and pre-warning system (MVMPS), which can monitor the occurrence and development of mutations and variants of SARS-CoV-2, as well as provide pre-warning for the prevention and control of SARS-CoV-2 (https://www.omicx.cn/). Additionally, this system could be used in real time to update the “mutation whitelist” and “mutation blacklist” of SARS-CoV-2.
Funding: This study was supported by funding from the Foundation of the Committee on Science and Technology of Tianjin (19YFZCSN00080), the State Key Research and Development Plan (2019YFC1605004), and the National Key Programs for Infectious Diseases of China (2017ZX10303405-001).
Abstract: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic resulted in significant societal costs. Hence, an in-depth understanding of SARS-CoV-2 virus mutation and its evolution will help determine the direction of the COVID-19 pandemic. In this study, we identified 296,728 de novo mutations in more than 2,800,000 high-quality SARS-CoV-2 genomes. All possible factors affecting the mutation frequency of SARS-CoV-2 in human hosts were analyzed, including zinc finger antiviral proteins, sequence context, amino acid change, and translation efficiency. As a result, we proposed that when adenine (A) and thymine (T) bases are in the context of an AM (M stands for adenine or cytosine) or TA motif, the A or T base has a lower mutation frequency. Furthermore, we hypothesized that translation efficiency can affect the mutation frequency of the third position of the codon through selection, which explains why SARS-CoV-2 prefers AT3 codon usage. In addition, we found a host-specific asymmetric dinucleotide mutation frequency in the SARS-CoV-2 genome, which provides a new basis for determining the origin of SARS-CoV-2. Finally, we summarize all possible factors affecting mutation frequency and provide insights into the mutation characteristics and evolutionary trends of SARS-CoV-2.
Funding: Supported by the Strategic Priority Research Program (B) of the Chinese Academy of Sciences (XDB02020003 and XDB02030002), the Bureau of Frontier Sciences and Education, Chinese Academy of Sciences (QYZDJ-SSW-SMC005), the National Natural Science Foundation of China (Nos. 81088001, 81271484, 81471361 and 81371480), the Beijing Training Project for the Leading Talents in S & T (Z151100000315020), the National Key Basic Research and Development Program (973) (2012CB517904), and the CAS/SAFEA International Partnership Programme for Creative Research Teams (Y2CX131003).
Abstract: Schizophrenia is a common disorder with a high heritability, but its genetic architecture is still elusive. We implemented whole-genome sequencing (WGS) analysis of 8 families with monozygotic (MZ) twin pairs discordant for schizophrenia to assess the potential association of de novo mutations (DNMs) or inherited variants with susceptibility to schizophrenia. Eight non-synonymous DNMs (including one splicing site) were identified and shared by twins, which were either located in previously reported schizophrenia risk genes (p.V24689I mutation in TTN, p.S2506T mutation in GCN1L1, IVS3+1G>T in DOCK1) or had a benign to damaging effect according to in silico prediction analysis. By searching the inherited rare damaging or loss-of-function (LOF) variants and common susceptibility alleles from three classes of schizophrenia candidate genes, we were able to distill genetic alterations in several schizophrenia risk genes, including GAD1, PLXNA2, RELN and FEZ1. Four inherited copy number variations (CNVs; including a large deletion at 16p13.11) implicated in schizophrenia were identified in four families, respectively. Most families carried both missense DNMs and inherited risk variants, which might suggest that DNMs, inherited rare damaging variants and common risk alleles together conferred susceptibility to schizophrenia. Our results support that schizophrenia is caused by a combination of multiple genetic factors, with each DNM/variant showing a relatively small effect size.
Funding: Supported by the National Institutes of Health, USA (MH101054).
Abstract: Schizophrenia (SCZ) is a complex and heterogeneous mental disorder that affects about 1% of the global population. In recent years, considerable progress has been made in genetic studies of SCZ. A number of common variants with small effects and rare variants with relatively larger effects have been identified. These variants include risk loci identified by genome-wide association studies, rare copy-number variants identified by comparative genomic analyses, and de novo mutations identified by high-throughput DNA sequencing. Collectively, they contribute to the heterogeneity of the disease. In this review, we update recent discoveries in the field of SCZ genetics, and outline the perspectives of future directions.