"Synthetic"allopolyploids recreated by interspecific hybridization play an important role in providing novel genomic variation for crop improvement.Such synthetic allopolyploids often undergo rapid genomic s..."Synthetic"allopolyploids recreated by interspecific hybridization play an important role in providing novel genomic variation for crop improvement.Such synthetic allopolyploids often undergo rapid genomic structural variation(SV).However,how such SV arises,is inherited and fixed,and how it affects important traits,has rarely been comprehensively and quantitively studied in advanced generation synthetic lines.A better understanding of these processes will aid breeders in knowing how to best utilize synthetic allopolyploids in breeding programs.Here,we analyzed three genetic mapping populations(735 DH lines)derived from crosses between advanced synthetic and conventional Brassica napus(rapeseed)lines,using whole-genome sequencing to determine genome composition.We observed high tolerance of large structural variants,particularly toward the telomeres,and preferential selection for balanced homoeologous exchanges(duplication/deletion events between the A and C genomes resulting in retention of gene/chromosome dosage between homoeologous chromosome pairs),including stable events involving whole chromosomes("pseudoeuploidy").Given the experimental design(all three populations shared a common parent),we were able to observe that parental SV was regularly inherited,showed genetic hitchhiking effects on segregation,and was one of the major factors inducing adjacent novel and larger SV.Surprisingly,novel SV occurred at low frequencies with no significant impacts on observed fertility and yield-related traits in the advanced generation synthetic lines.However,incorporating genome-wide SV in linkage mapping explained significantly more genetic variance for traits.Our results provide a framework for detecting and understanding the occurrence and inheritance of genomic SV 
in breeding programs,and support the use of synthetic parents as an important source of novel trait variation.展开更多
Tea green leafhopper(TGL),Empoasca onukii,is of biological and economic interest.Despite numerous studies,the mechanisms underlying its adaptation and evolution remain enigmatic.Here,we use previously untapped genome ...Tea green leafhopper(TGL),Empoasca onukii,is of biological and economic interest.Despite numerous studies,the mechanisms underlying its adaptation and evolution remain enigmatic.Here,we use previously untapped genome and population genetics approaches to examine how the pest adapted to different environmental variables and thus has expanded geographically.We complete a chromosome-level assembly and annotation of the E.onukii genome,showing notable expansions of gene families associated with adaptation to chemoreception and detoxification.Genomic signals indicating balancing selection highlight metabolic pathways involved in adaptation to a wide range of tea varieties grown across ecologically diverse regions.Patterns of genetic variations among 54 E.onukii samples unveil the population structure and evolutionary history across different tea-growing regions in China.Our results demonstrate that the genomic changes in key pathways,including those linked to metabolism,circadian rhythms,and immune system functions,may underlie the successful spread and adaptation of E.onukii.This work highlights the genetic and molecular basis underlying the evolutionary success of a species with broad economic impacts,and provides insights into insect adaptation to host plants,which will ultimately facilitate more sustainable pest management.展开更多
As a typical representative of global complex diseases,psoriasis has attracted widespread attention because of its high heritability,heterogeneity,and incidence.Environmentally induced activation of the inflammatory-i...As a typical representative of global complex diseases,psoriasis has attracted widespread attention because of its high heritability,heterogeneity,and incidence.Environmentally induced activation of the inflammatory-immune axis in patients with psoriasis relies on genetic regulation of genomic variation.The heritability of psoriasis exceeds 80%,and research of genomic variation in psoriasis is of great significance to the interpretation of the biological pathogenesis of the disease.The development of genome-wide association studies(GWASs)has provided a powerful means for the capture of psoriasis susceptibility genes.More than 100 psoriasis susceptibility loci have been captured,enabling humans to gain a breakthrough understanding of the genetics and traits of psoriasis.With the advancement of research methods,increasingly more genetic methodologies are being used to capture the locations and types of variants outside the scope of GWAS scanning,making up for the inclinations and deficiencies of traditional GWAS capture of gene loci in a more detailed manner.This review covers several decades of research on genomic variation in psoriasis,including GWASs in psoriasis,the capture of functional gene variant types,and the translation of genomic variation into precision medicine;summarizes the research progress of genomic variation in psoriasis;and provides a theoretical reference for future genetic-based research of the mechanisms underlying psoriasis.展开更多
Objective:To summarize the application value of copy number variant sequencing(CNV-seq)in the detection of fetal chromosome and cytomegalovirus load.Methods:The study analyzed the clinical basic data,relevant laborato...Objective:To summarize the application value of copy number variant sequencing(CNV-seq)in the detection of fetal chromosome and cytomegalovirus load.Methods:The study analyzed the clinical basic data,relevant laboratory tests,treatment process,and outcomes of three patients with positive cytomegalovirus load detected by CNV-seq for fetal chromosomes and cytomegalovirus load,and literature review was done simutaneoubly.Results:In all three cases,the amniotic fluid cytomegalovirus load was less than 105 Copies/ml,and there were no significant neurological abnormalities observed during pregnancy or postpartum follow-up.There is no literature review on the application of CNV-seq technology in the detection of cytomegalovirus infection,only literature reports on genome analysis of CMV-DNA in confirmed patients were available.Conclusion:CNV-seq can be used to detect cytomegalovirus load,which may have a certain degree of predictive value for fetal outcome.CNV-seq can simultaneously detect fetal chromosomes and pathogenic microorganisms,which is of great significance for the prevention and control of birth defects.展开更多
Approximately 20%of colorectal cancer(CRC)patients present with metastasis at diagnosis.Among Stage I-III CRC patients who undergo surgical resection,18%typically suffer from distal metastasis within the first three y...Approximately 20%of colorectal cancer(CRC)patients present with metastasis at diagnosis.Among Stage I-III CRC patients who undergo surgical resection,18%typically suffer from distal metastasis within the first three years following initial treatment.The median survival duration after the diagnosis of metastatic CRC(mCRC)is only 9 mo.mCRC is traditionally considered to be an advanced stage malignancy or is thought to be caused by incomplete resection of tumor tissue,allowing cancer cells to spread from primary to distant organs;however,increa-sing evidence suggests that the mCRC process can begin early in tumor development.CRC patients present with high heterogeneity and diverse cancer phenotypes that are classified on the basis of molecular and morphological alterations.Different genomic and nongenomic events can induce subclone diversity,which leads to cancer and metastasis.Throughout the course of mCRC,metastatic cascades are associated with invasive cancer cell migration through the circulatory system,extravasation,distal seeding,dormancy,and reactivation,with each step requiring specific molecular functions.However,cancer cells presenting neoantigens can be recognized and eliminated by the immune system.In this review,we explain the biological factors that drive CRC metastasis,namely,genomic instability,epigenetic instability,the metastatic cascade,the cancer-immunity cycle,and external lifestyle factors.Despite remarkable progress in CRC research,the role of molecular classification in therapeutic intervention remains unclear.This review shows the driving factors of mCRC which may help in identifying potential candidate biomarkers that can improve the diagnosis and early detection of mCRC cases.展开更多
Objective:Histology grade,subtypes and TNM stage of lung adenocarcinomas are useful predictors of prognosis and survival.The aim of the study was to investigate the relationship between chromosomal instability,morphol...Objective:Histology grade,subtypes and TNM stage of lung adenocarcinomas are useful predictors of prognosis and survival.The aim of the study was to investigate the relationship between chromosomal instability,morphological subtypes and the grading system used in lung non-mucinous adenocarcinoma(LNMA).Methods:We developed a whole genome copy number variation(WGCNV)scoring system and applied next generation sequencing to evaluate CNVs present in 91 LNMA tumor samples.Results:Higher histological grades,aggressive subtypes and more advanced TNM staging were associated with an increased WGCNV score,particularly in CNV regions enriched for tumor suppressor genes and oncogenes.In addition,we demonstrate that 24-chromosome CNV profiling can be performed reliably from specific cell types(<100 cells)isolated by sample laser capture microdissection.Conclusions:Our findings suggest that the WGCNV scoring system we developed may have potential value as an adjunct test for predicting the prognosis of patients diagnosed with LNMA.展开更多
Advances in DNA sequencing technology have sparked a genomics revolution,driving breakthroughs in plant genetics and crop breeding.Recently,the focus has shifted from cataloging genetic diversity in plants to explorin...Advances in DNA sequencing technology have sparked a genomics revolution,driving breakthroughs in plant genetics and crop breeding.Recently,the focus has shifted from cataloging genetic diversity in plants to exploring their functional significance and delivering beneficial alleles for crop improvement.This transformation has been facilitated by the increasing adoption of whole-genome resequencing.In this review,we summarize the current progress of population-based genome resequencing studies and how these studies affect crop breeding.A total of 187 land plants from 163 countries have been resequenced,comprising 54413 accessions.As part of resequencing efforts 367 traits have been surveyed and 86 genome-wide association studies have been conducted.Economically important crops,particularly cereals,vegetables,and legumes,have dominated the resequencing efforts,leaving a gap in 49 orders,including Lycopodiales,Liliales,Acorales,Austrobaileyales,and Commelinales.The resequenced germplasm is distributed across diverse geographic locations,providing a global perspective on plant genomics.We highlight genes that have been selected during domestication,or associated with agronomic traits,and form a repository of candidate genes for future research and application.Despite the opportunities for cross-species comparative genomics,many population genomic datasets are not accessible,impeding secondary analyses.We call for a more open and collaborative approach to population genomics that promotes data sharing and encourages contribution-based credit policy.The number of plant genome resequencing studies will continue to rise with the decreasing DNA sequencing costs,coupled with advances in analysis and computational technologies.This expansion,in terms of both 
scale and quality,holds promise for deeper insights into plant trait genetics and breeding design.展开更多
Plants produce a remarkable diversity of structurally and functionally diverse natural chemicals that serve as adaptive compounds throughout their life cycles.However,unlocking this metabolic diversity is significantl...Plants produce a remarkable diversity of structurally and functionally diverse natural chemicals that serve as adaptive compounds throughout their life cycles.However,unlocking this metabolic diversity is significantly impeded by the size,complexity,and abundant repetitive elements of typical plant genomes.As genome sequencing becomes routine,we anticipate that links between metabolic diversity and genetic variation will be strengthened.In addition,an ever-increasing number of plant genomes have revealed that biosynthetic gene clusters are not only a hallmark of microbes and fungi;gene clusters for various classes of compounds have also been found in plants,and many are associated with important agronomic traits.We present recent examples of plant metabolic diversification that have been discovered through the exploration and exploitation of various genomic and pan-genomic data.We also draw attention to the fundamental genomic and pan-genomic basis of plant chemodiversity and discuss challenges and future perspectives for investigating metabolic diversity in the coming pan-genomics era.展开更多
Subject Code:C01The United Nations estimates that world population will increase to 11.2billion in the year 2100.Vegetative oil that serves as one of the major energy resources is essential to feeding human beings.Oil...Subject Code:C01The United Nations estimates that world population will increase to 11.2billion in the year 2100.Vegetative oil that serves as one of the major energy resources is essential to feeding human beings.Oil palm(Elaeis guineensis Jacq,Elaeis from ancient Greek,meaning'oil')produces more than 13times the yield of oil/year/hectare of soybean,one major human annual oil crop.In consequence,it represents a展开更多
On January 22,2020,China National Center for Bioinformation(CNCB)released the 2019 Novel Coronavirus Resource(2019nCoVR),an open-access information resource for the severe acute respiratory syndrome coronavirus 2(SARS...On January 22,2020,China National Center for Bioinformation(CNCB)released the 2019 Novel Coronavirus Resource(2019nCoVR),an open-access information resource for the severe acute respiratory syndrome coronavirus 2(SARS-CoV-2).2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates,which are manually curated with value-added annotations and quality evaluated by an automated in-house pipeline.Of particular note,2019nCoVR offers systematic analyses to generate a dynamic landscape of SARS-CoV-2 genomic variations at a global scale.It provides all identified variants and their detailed statistics for each virus isolate,and congregates the quality score,functional annotation,and population frequency for each variant.Spatiotemporal change for each variant can be visualized and historical viral haplotype network maps for the course of the outbreak are also generated based on all complete and high-quality genomes available.Moreover,2019nCoVR provides a full collection of SARS-CoV-2 relevant literature on the coronavirus disease 2019(COVID-19),including published papers from PubMed as well as preprints from services such as bioRxiv and medRxiv through Europe PMC.Furthermore,by linking with relevant databases in CNCB,2019nCoVR offers data submission services for raw sequence reads and assembled genomes,and data sharing with NCBI.Collectively,SARS-CoV-2 is updated daily to collect the latest information on genome sequences,variants,haplotypes,and literature for a timely reflection,making 2019nCoVR a valuable resource for the global research community.2019nCoVR is accessible at https://bigd.big.ac.cn/ncov/.展开更多
Accurately identifying DNA polymorphisms can bridge the gap between phenotypes and genotypes and is essential for molecular marker assisted genetic studies.Genome complexities,including large-scale structural variatio...Accurately identifying DNA polymorphisms can bridge the gap between phenotypes and genotypes and is essential for molecular marker assisted genetic studies.Genome complexities,including large-scale structural variations,bring great challenges to bioinformatic analysis for obtaining high-confidence genomic variants,as sequence differences between non-allelic loci of two or more genomes can be misinterpreted as polymorphisms.It is important to correctly filter out artificial variants to avoid false genotyping or estimation of allele frequencies.Here,we present an efficient and effective framework,inGAP-family,to discover,filter,and visualize DNA polymorphisms and structural variants(SVs)from alignment of short reads.Applying this method to polymorphism detection on real datasets shows that elimination of artificial variants greatly facilitates the precise identification of meiotic recombination points as well as causal mutations in mutant genomes or quantitative trait loci.In addition,inGAP-family provides a user-friendly graphical interface for detecting polymorphisms and SVs,further evaluating predicted variants and identifying mutations related to genotypes.It is accessible at https://sourceforge.net/projects/ingap-family/.展开更多
Funding: Supported by the National Natural Science Foundation of China (NSFC, 31970564, 32000397, 32171982) and the Fundamental Research Funds for the Central Universities (2662023PY004).
Abstract: "Synthetic" allopolyploids recreated by interspecific hybridization play an important role in providing novel genomic variation for crop improvement. Such synthetic allopolyploids often undergo rapid genomic structural variation (SV). However, how such SV arises, is inherited and fixed, and how it affects important traits has rarely been comprehensively and quantitatively studied in advanced-generation synthetic lines. A better understanding of these processes will help breeders make the best use of synthetic allopolyploids in breeding programs. Here, we analyzed three genetic mapping populations (735 DH lines) derived from crosses between advanced synthetic and conventional Brassica napus (rapeseed) lines, using whole-genome sequencing to determine genome composition. We observed high tolerance of large structural variants, particularly toward the telomeres, and preferential selection for balanced homoeologous exchanges (duplication/deletion events between the A and C genomes resulting in retention of gene/chromosome dosage between homoeologous chromosome pairs), including stable events involving whole chromosomes ("pseudoeuploidy"). Given the experimental design (all three populations shared a common parent), we were able to observe that parental SV was regularly inherited, showed genetic hitchhiking effects on segregation, and was one of the major factors inducing adjacent novel and larger SV. Surprisingly, novel SV occurred at low frequencies with no significant impacts on observed fertility and yield-related traits in the advanced-generation synthetic lines. However, incorporating genome-wide SV in linkage mapping explained significantly more genetic variance for traits. Our results provide a framework for detecting and understanding the occurrence and inheritance of genomic SV in breeding programs, and support the use of synthetic parents as an important source of novel trait variation.
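The notion of a "balanced" homoeologous exchange in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that each homoeologous segment pair has already been assigned integer copy numbers for its A-genome and C-genome copies, and that an unaltered allotetraploid carries two copies of each (four in total). The names `HomoeologousSegment`, `classify`, and `EXPECTED_TOTAL`, as well as the segment labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: an unaltered allotetraploid carries 2 A-genome copies and
# 2 C-genome copies of each homoeologous segment, i.e. 4 copies in total.
EXPECTED_TOTAL = 4


@dataclass(frozen=True)
class HomoeologousSegment:
    """Copy numbers for one homoeologous segment pair (hypothetical structure)."""
    name: str
    a_copies: int  # copies of the A-genome version of the segment
    c_copies: int  # copies of the C-genome version of the segment


def classify(segment: HomoeologousSegment) -> str:
    """Label a segment pair by whether total homoeologous dosage is retained."""
    if segment.a_copies < 0 or segment.c_copies < 0:
        raise ValueError("copy numbers must be non-negative")
    if segment.a_copies == 2 and segment.c_copies == 2:
        return "no exchange"
    if segment.a_copies + segment.c_copies == EXPECTED_TOTAL:
        # A duplication in one subgenome is offset by a deletion in the other.
        return "balanced exchange"
    return "unbalanced"


if __name__ == "__main__":
    examples = [
        HomoeologousSegment("A01/C01 (illustrative)", 4, 0),
        HomoeologousSegment("A02/C02 (illustrative)", 2, 2),
        HomoeologousSegment("A03/C03 (illustrative)", 4, 2),
    ]
    for seg in examples:
        print(seg.name, "->", classify(seg))
```

Under these assumptions, a pair with four A copies and no C copies is labeled a balanced exchange because the total dosage of four is retained, whereas four A copies alongside two C copies is labeled unbalanced.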
Funding: Supported by the National Key R&D Program of China (Grant No. 2019YFD1002100); the Natural Science Foundation of Fujian Province, China (Grant No. 2020J01525); the Fujian Agriculture and Forestry University Construction Project for Technological Innovation and Service System of Tea Industry, China (Grant No. K1520005A03); and the Key International Science and Technology Cooperation Project of China (Grant No. 2016YFE0102100).
Abstract: Tea green leafhopper (TGL), Empoasca onukii, is of biological and economic interest. Despite numerous studies, the mechanisms underlying its adaptation and evolution remain enigmatic. Here, we use previously untapped genome and population genetics approaches to examine how the pest adapted to different environmental variables and thus has expanded geographically. We complete a chromosome-level assembly and annotation of the E. onukii genome, showing notable expansions of gene families associated with adaptation to chemoreception and detoxification. Genomic signals indicating balancing selection highlight metabolic pathways involved in adaptation to a wide range of tea varieties grown across ecologically diverse regions. Patterns of genetic variation among 54 E. onukii samples unveil the population structure and evolutionary history across different tea-growing regions in China. Our results demonstrate that the genomic changes in key pathways, including those linked to metabolism, circadian rhythms, and immune system functions, may underlie the successful spread and adaptation of E. onukii. This work highlights the genetic and molecular basis underlying the evolutionary success of a species with broad economic impacts, and provides insights into insect adaptation to host plants, which will ultimately facilitate more sustainable pest management.
Abstract: As a typical representative of global complex diseases, psoriasis has attracted widespread attention because of its high heritability, heterogeneity, and incidence. Environmentally induced activation of the inflammatory-immune axis in patients with psoriasis relies on genetic regulation of genomic variation. The heritability of psoriasis exceeds 80%, and research on genomic variation in psoriasis is of great significance to the interpretation of the biological pathogenesis of the disease. The development of genome-wide association studies (GWASs) has provided a powerful means of capturing psoriasis susceptibility genes. More than 100 psoriasis susceptibility loci have been captured, enabling a breakthrough in understanding the genetics and traits of psoriasis. With the advancement of research methods, a growing number of genetic methodologies are being used to capture the locations and types of variants outside the scope of GWAS scanning, compensating in finer detail for the biases and deficiencies of traditional GWAS capture of gene loci. This review covers several decades of research on genomic variation in psoriasis, including GWASs in psoriasis, the capture of functional gene variant types, and the translation of genomic variation into precision medicine; summarizes the research progress on genomic variation in psoriasis; and provides a theoretical reference for future genetics-based research on the mechanisms underlying psoriasis.
Funding: Hainan Natural Science Foundation (821RC699); Hainan Natural Science Foundation (822RC825); Hainan Provincial Health Industry Research Project (22A200242); Key R&D Plan of Hainan Province (ZDYF2020225).
Abstract: Objective: To summarize the application value of copy number variant sequencing (CNV-seq) in the detection of fetal chromosomes and cytomegalovirus load. Methods: The study analyzed the basic clinical data, relevant laboratory tests, treatment process, and outcomes of three patients with a positive cytomegalovirus load detected by CNV-seq for fetal chromosomes and cytomegalovirus load; a literature review was conducted simultaneously. Results: In all three cases, the amniotic fluid cytomegalovirus load was less than 10^5 copies/ml, and no significant neurological abnormalities were observed during pregnancy or postpartum follow-up. No literature review on the application of CNV-seq technology in the detection of cytomegalovirus infection was found; only literature reports on genome analysis of CMV-DNA in confirmed patients were available. Conclusion: CNV-seq can be used to detect cytomegalovirus load, which may have a certain degree of predictive value for fetal outcome. CNV-seq can simultaneously detect fetal chromosomes and pathogenic microorganisms, which is of great significance for the prevention and control of birth defects.
Abstract: Approximately 20% of colorectal cancer (CRC) patients present with metastasis at diagnosis. Among Stage I-III CRC patients who undergo surgical resection, 18% typically suffer from distal metastasis within the first three years following initial treatment. The median survival duration after the diagnosis of metastatic CRC (mCRC) is only 9 months. mCRC is traditionally considered to be an advanced-stage malignancy or is thought to be caused by incomplete resection of tumor tissue, allowing cancer cells to spread from primary to distant organs; however, increasing evidence suggests that the mCRC process can begin early in tumor development. CRC patients present with high heterogeneity and diverse cancer phenotypes that are classified on the basis of molecular and morphological alterations. Different genomic and nongenomic events can induce subclone diversity, which leads to cancer and metastasis. Throughout the course of mCRC, metastatic cascades are associated with invasive cancer cell migration through the circulatory system, extravasation, distal seeding, dormancy, and reactivation, with each step requiring specific molecular functions. However, cancer cells presenting neoantigens can be recognized and eliminated by the immune system. In this review, we explain the biological factors that drive CRC metastasis, namely, genomic instability, epigenetic instability, the metastatic cascade, the cancer-immunity cycle, and external lifestyle factors. Despite remarkable progress in CRC research, the role of molecular classification in therapeutic intervention remains unclear. This review describes the driving factors of mCRC, which may help in identifying potential candidate biomarkers that can improve the diagnosis and early detection of mCRC cases.
Funding: Grants from the Beijing Hospital Key Research Program (121 Research Program, No. BJ2019-195).
Abstract: Objective: Histology grade, subtypes, and TNM stage of lung adenocarcinomas are useful predictors of prognosis and survival. The aim of the study was to investigate the relationship between chromosomal instability, morphological subtypes, and the grading system used in lung non-mucinous adenocarcinoma (LNMA). Methods: We developed a whole genome copy number variation (WGCNV) scoring system and applied next-generation sequencing to evaluate CNVs present in 91 LNMA tumor samples. Results: Higher histological grades, aggressive subtypes, and more advanced TNM staging were associated with an increased WGCNV score, particularly in CNV regions enriched for tumor suppressor genes and oncogenes. In addition, we demonstrate that 24-chromosome CNV profiling can be performed reliably from specific cell types (<100 cells) isolated by sample laser capture microdissection. Conclusions: Our findings suggest that the WGCNV scoring system we developed may have potential value as an adjunct test for predicting the prognosis of patients diagnosed with LNMA.
Funding: Supported by the National Key Research and Development Program of China (2020YFE0202300); the Science and Technology Major Project of Guangxi (GuiKeAA20108005-2); the Guangdong Innovation Research Team Fund (grant number: 2014ZT05S078); and the National Key Research and Development Program of China (2019YFA0707000). No conflict of interest declared.
Abstract: Advances in DNA sequencing technology have sparked a genomics revolution, driving breakthroughs in plant genetics and crop breeding. Recently, the focus has shifted from cataloging genetic diversity in plants to exploring its functional significance and delivering beneficial alleles for crop improvement. This transformation has been facilitated by the increasing adoption of whole-genome resequencing. In this review, we summarize the current progress of population-based genome resequencing studies and how these studies affect crop breeding. A total of 187 land plants from 163 countries have been resequenced, comprising 54,413 accessions. As part of resequencing efforts, 367 traits have been surveyed and 86 genome-wide association studies have been conducted. Economically important crops, particularly cereals, vegetables, and legumes, have dominated the resequencing efforts, leaving a gap in 49 orders, including Lycopodiales, Liliales, Acorales, Austrobaileyales, and Commelinales. The resequenced germplasm is distributed across diverse geographic locations, providing a global perspective on plant genomics. We highlight genes that have been selected during domestication or associated with agronomic traits, and that form a repository of candidate genes for future research and application. Despite the opportunities for cross-species comparative genomics, many population genomic datasets are not accessible, impeding secondary analyses. We call for a more open and collaborative approach to population genomics that promotes data sharing and encourages a contribution-based credit policy. The number of plant genome resequencing studies will continue to rise with decreasing DNA sequencing costs, coupled with advances in analysis and computational technologies. This expansion, in terms of both scale and quality, holds promise for deeper insights into plant trait genetics and breeding design.
Funding: The Z.L. laboratory is supported by a startup grant provided by Shanghai Jiao Tong University, School of Agriculture and Biology, and by the Shanghai Pujiang Program (20PJ1405900).
Abstract: Plants produce a remarkable range of structurally and functionally diverse natural chemicals that serve as adaptive compounds throughout their life cycles. However, unlocking this metabolic diversity is significantly impeded by the size, complexity, and abundant repetitive elements of typical plant genomes. As genome sequencing becomes routine, we anticipate that links between metabolic diversity and genetic variation will be strengthened. In addition, an ever-increasing number of plant genomes have revealed that biosynthetic gene clusters are not only a hallmark of microbes and fungi; gene clusters for various classes of compounds have also been found in plants, and many are associated with important agronomic traits. We present recent examples of plant metabolic diversification that have been discovered through the exploration and exploitation of various genomic and pan-genomic data. We also draw attention to the fundamental genomic and pan-genomic basis of plant chemodiversity and discuss challenges and future perspectives for investigating metabolic diversity in the coming pan-genomics era.
Abstract: Subject Code: C01. The United Nations estimates that world population will increase to 11.2 billion in the year 2100. Vegetable oil, which serves as one of the major energy resources, is essential to feeding human beings. Oil palm (Elaeis guineensis Jacq, Elaeis from ancient Greek, meaning 'oil') produces more than 13 times the yield of oil/year/hectare of soybean, one major human annual oil crop. In consequence, it represents a
Funding: This work was supported by grants from the Strategic Priority Research Program of Chinese Academy of Sciences (Grant Nos. XDA19090116, XDA19050302, and XDB38030400) awarded to SS, ZZ, and ML; the National Key R&D Program of China (Grant Nos. 2020YFC0848900, 2020YFC0847000, 2016YFE0206600, and 2017YFC0907502); the 13th Five-year Informatization Plan of Chinese Academy of Sciences (Grant No. XXH13505-05); Genomics Data Center Construction of Chinese Academy of Sciences (Grant No. XXH-13514-0202); the Open Biodiversity and Health Big Data Programme of International Union of Biological Sciences, International Partnership Program of Chinese Academy of Sciences (Grant No. 153F11KYSB20160008); and the Professional Association of the Alliance of International Science Organizations (Grant No. ANSO-PA-2020-07). This work was also supported by the KC Wong Education Foundation to ZZ and the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant Nos. 2017141 and 2019104) awarded to SS and ML.
Abstract: On January 22, 2020, China National Center for Bioinformation (CNCB) released the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access information resource for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality-evaluated by an automated in-house pipeline. Of particular note, 2019nCoVR offers systematic analyses to generate a dynamic landscape of SARS-CoV-2 genomic variations at a global scale. It provides all identified variants and their detailed statistics for each virus isolate, and aggregates the quality score, functional annotation, and population frequency for each variant. Spatiotemporal change for each variant can be visualized, and historical viral haplotype network maps for the course of the outbreak are also generated based on all complete and high-quality genomes available. Moreover, 2019nCoVR provides a full collection of SARS-CoV-2 relevant literature on the coronavirus disease 2019 (COVID-19), including published papers from PubMed as well as preprints from services such as bioRxiv and medRxiv through Europe PMC. Furthermore, by linking with relevant databases in CNCB, 2019nCoVR offers data submission services for raw sequence reads and assembled genomes, and data sharing with NCBI. Collectively, 2019nCoVR is updated daily to collect the latest information on genome sequences, variants, haplotypes, and literature for a timely reflection, making 2019nCoVR a valuable resource for the global research community. 2019nCoVR is accessible at https://bigd.big.ac.cn/ncov/.
Funding: Supported by grants from the National Natural Science Foundation of China (Grant Nos. 32070247 and 31770244 to JQ) and funds from the State Key Laboratory of Genetic Engineering at Fudan University, China.
Abstract: Accurately identifying DNA polymorphisms can bridge the gap between phenotypes and genotypes and is essential for molecular marker-assisted genetic studies. Genome complexities, including large-scale structural variations, pose great challenges to bioinformatic analysis for obtaining high-confidence genomic variants, as sequence differences between non-allelic loci of two or more genomes can be misinterpreted as polymorphisms. It is important to correctly filter out artificial variants to avoid false genotyping or incorrect estimation of allele frequencies. Here, we present an efficient and effective framework, inGAP-family, to discover, filter, and visualize DNA polymorphisms and structural variants (SVs) from alignment of short reads. Applying this method to polymorphism detection on real datasets shows that elimination of artificial variants greatly facilitates the precise identification of meiotic recombination points as well as causal mutations in mutant genomes or quantitative trait loci. In addition, inGAP-family provides a user-friendly graphical interface for detecting polymorphisms and SVs, further evaluating predicted variants, and identifying mutations related to genotypes. It is accessible at https://sourceforge.net/projects/ingap-family/.