Funding: Supported by Technological Innovation 2030 (2022ZD0401701); the National Natural Science Foundation of China (32000475, 32030021); the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA24040201); and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Y2021038).
Abstract: Genomic data serve as an invaluable resource for unraveling the intricacies of the higher plant systems, including the constituent elements within and among species. Through various efforts in genomic data archiving, integrative analysis, and value-added curation, the National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), has successfully established and currently maintains a vast amount of database resources. This dedicated initiative of the NGDC facilitates a data-rich ecosystem that greatly strengthens and supports genomic research efforts. Here, we present a comprehensive overview of central repositories dedicated to archiving, presenting, and sharing plant omics data, introduce knowledgebases focused on variants or gene-based functional insights, highlight species-specific multiple omics database resources, and briefly review the online application tools. We intend that this review can be used as a guide map for plant researchers wishing to select effective data resources from the NGDC for their specific areas of study.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA24000000, XDA19050302, and XDA24040201); the Science and Technology Innovation 2030-Major Project (2022ZD04017); the National Natural Science Foundation of China (32030021, 32000475, and 32201775); the National Key Research and Development Program of China (2021YFF1001201); the Taishan Scholars Program; the Xplorer Prize Award; the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Y2021038); and the China National Postdoctoral Program for Innovative Talents (BX2021354).
Abstract: Dear Editor, As one of the most important crops to supply the majority of plant oil and protein for the whole world, soybean is facing an increasing global demand. The reference genome of accession "Williams82" opened the gate of genomics research in soybean (Schmutz et al., 2010). After that, vast multi-omics data were generated, thereby providing valuable resources for functional study and molecular breeding. Parts of these data have been collected in different soybean databases (see details in Supplemental Table 1), such as Soybase (Grant et al., 2010) and SoyKB (Joshi et al., 2012), which made valuable efforts to facilitate the wide utility of these data. Nevertheless, these existing databases poorly tackled multi-omics data integration and interactivity for soybean, provoking tremendous challenges for researchers to deal with these big omics data, particularly considering the unprecedented rate of data growth (Yang et al., 2021). Thus, constructing an integrated multi-omics database for soybean that provides a one-stop solution for big data mining with friendly interactivity is highly desired.
Funding: This work was supported by grants from the Strategic Priority Research Program of Chinese Academy of Sciences (Grant Nos. XDA19090116, XDA19050302, and XDB38030400) awarded to SS, ZZ, and ML; the National Key R&D Program of China (Grant Nos. 2020YFC0848900, 2020YFC0847000, 2016YFE0206600, and 2017YFC0907502); the 13th Five-year Informatization Plan of Chinese Academy of Sciences (Grant No. XXH13505-05); Genomics Data Center Construction of Chinese Academy of Sciences (Grant No. XXH-13514-0202); the Open Biodiversity and Health Big Data Programme of International Union of Biological Sciences, International Partnership Program of Chinese Academy of Sciences (Grant No. 153F11KYSB20160008); and the Professional Association of the Alliance of International Science Organizations (Grant No. ANSO-PA-2020-07). This work was also supported by the KC Wong Education Foundation to ZZ and the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant Nos. 2017141 and 2019104) awarded to SS and ML.
Abstract: On January 22, 2020, China National Center for Bioinformation (CNCB) released the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access information resource for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality evaluated by an automated in-house pipeline. Of particular note, 2019nCoVR offers systematic analyses to generate a dynamic landscape of SARS-CoV-2 genomic variations at a global scale. It provides all identified variants and their detailed statistics for each virus isolate, and congregates the quality score, functional annotation, and population frequency for each variant. Spatiotemporal change for each variant can be visualized and historical viral haplotype network maps for the course of the outbreak are also generated based on all complete and high-quality genomes available. Moreover, 2019nCoVR provides a full collection of SARS-CoV-2 relevant literature on the coronavirus disease 2019 (COVID-19), including published papers from PubMed as well as preprints from services such as bioRxiv and medRxiv through Europe PMC. Furthermore, by linking with relevant databases in CNCB, 2019nCoVR offers data submission services for raw sequence reads and assembled genomes, and data sharing with NCBI. Collectively, SARS-CoV-2 is updated daily to collect the latest information on genome sequences, variants, haplotypes, and literature for a timely reflection, making 2019nCoVR a valuable resource for the global research community. 2019nCoVR is accessible at https://bigd.big.ac.cn/ncov/.
Funding: Support from the Youth Innovation Promotion Association of the Chinese Academy of Sciences, China (Grant No. 2017141) awarded to SS; the Strategic Priority Research Program (Grant No. XDA08010304); the Key Research Program of Frontier Sciences (Grant No. QYZDY-SSW-SMC017); and R&D Projects of Scientific Research Equipment Programs (Grant Nos. YZ201568 and YZ201402) of the Chinese Academy of Sciences, China awarded to JY.
Abstract: Domestic rice (Oryza sativa L.) is one of the most important cereal crops, feeding a large number of worldwide populations. Along with various high-throughput genome sequencing projects, rice genomics has been making great headway toward direct field applications of basic research advances in understanding the molecular mechanisms of agronomical traits and utilizing diverse germplasm resources. Here, we briefly review its achievements over the past two decades and present the potential for its bright future.
Funding: Supported by grants from the National Key R&D Program of China (Grant Nos. 2021YFC0863300, 2020YFC0848900, and 2016YFE0206600); the National Natural Science Foundation of China (Grant No. 82161148009); the Strategic Priority Research Program of Chinese Academy of Sciences, China (Grant Nos. XDA19090116 and XDB38060100); the Open Biodiversity and Health Big Data Programme of International Union of Biological Sciences, International Partnership Program of Chinese Academy of Sciences (Grant No. 153F11KYSB20160008); the Professional Association of the Alliance of International Science Organizations (Grant No. ANSO-PA-2020-07); and the Youth Innovation Promotion Association of Chinese Academy of Sciences (Grant No. 2017141).
Abstract: COVID-19 has swept globally and Pakistan is no exception. To investigate the initial introductions and transmissions of the SARS-CoV-2 in Pakistan, we performed the largest genomic epidemiology study of COVID-19 in Pakistan and generated 150 complete SARS-CoV-2 genome sequences from samples collected from March 16 to June 1, 2020. We identified a total of 347 mutated positions, 31 of which were over-represented in Pakistan. Meanwhile, we found over 1000 intra-host single-nucleotide variants (iSNVs). Several of them occurred concurrently, indicating possible interactions among them or coevolution. Some of the high-frequency iSNVs in Pakistan were not observed in the global population, suggesting strong purifying selections. The genomic epidemiology revealed five distinctive spreading clusters. The largest cluster consisted of 74 viruses which were derived from different geographic locations of Pakistan and formed a deep hierarchical structure, indicating an extensive and persistent nation-wide transmission of the virus that was probably attributed to a signature mutation (G8371T in ORF1ab) of this cluster. Furthermore, 28 putative international introductions were identified, several of which are consistent with the epidemiological investigations. In all, this study has inferred the possible pathways of introductions and transmissions of SARS-CoV-2 in Pakistan, which could aid ongoing and future viral surveillance and COVID-19 control.