Funding: The National Natural Science Foundation of China (32172614), a startup fund from Hainan University, and a Hainan Province Science and Technology Special Fund (ZDYF2023XDNY050).
Abstract: Fragaria vesca, commonly known as wild or woodland strawberry, is the most widely distributed diploid Fragaria species and is native to Europe and Asia. Because of its small plant size, low heterozygosity, and relative ease of genetic transformation, F. vesca has been a model plant for fruit research since the publication of its Illumina-based genome in 2011. However, its genomic contribution to octoploid cultivated strawberry remains a long-standing question. Here, we de novo assembled and annotated a telomere-to-telomere, gap-free genome of F. vesca ‘Hawaii 4’, with all seven chromosomes assembled into single contigs, providing the highest completeness and assembly quality to date. The gap-free genome is 220,785,082 bp in length and encodes 36,173 protein-coding gene models, including 1,153 newly annotated genes. All 14 telomeres and seven centromeres were annotated within the seven chromosomes. Among the three previously recognized wild diploid strawberry ancestors (F. vesca, F. iinumae, and F. viridis), phylogenomic analysis showed that F. vesca and F. viridis are the ancestors of the cultivated octoploid strawberry F. × ananassa, and F. vesca is its closest relative. Three subgenomes of F. × ananassa belong to the F. vesca group, and one is sister to F. viridis. We anticipate that this high-quality, telomere-to-telomere, gap-free F. vesca genome, combined with our phylogenomic inference of the origin of cultivated strawberry, will provide insight into the genomic evolution of Fragaria and facilitate strawberry genetics and molecular breeding.
Funding: Supported by the National Natural Science Foundation of China (31801856 to X.S.), the Hebei Province Higher Education Youth Talents Program (BJ2018016 to X.S.), the China-Hebei 100 Scholars Supporting Project (E2013100003 to X.W.), and the Natural Science Foundation of Hebei (C2017209103 to X.S.).
Abstract: Coriander (Coriandrum sativum L.), also known as cilantro, is a globally important vegetable and spice crop. Its genome and that of carrot are models for studying the evolution of the Apiaceae family. Here, we developed the Coriander Genomics Database (CGDB, http://cgdb.bio2db.com/) to collect, store, and integrate the genomic, transcriptomic, metabolic, functional annotation, and repeat sequence data of coriander and carrot to serve as a central online platform for Apiaceae and other related plants. Using these data sets in the CGDB, we found that seven transcription factor (TF) families had significantly more members in the coriander genome than in the carrot genome. The highest coriander-to-carrot ratio of family size was 3.15, for MADS TFs, followed by the ratios for tubby protein (TUB) and heat shock factors. As a demonstration of CGDB applications, we identified 17 TUB family genes and conducted systematic comparative and evolutionary analyses. RNA-seq data deposited in the CGDB also suggest dose compensation effects of gene expression in coriander. CGDB allows bulk downloading, significance searches, genome browser analyses, and BLAST searches for comparisons between coriander and other plants regarding genomics, gene families, gene collinearity, gene expression, and the metabolome. A detailed user manual and contact information are also available to provide support to the scientific research community and address scientific questions. CGDB will be continuously updated, and new data will be integrated for comparative and functional genomic analysis in Apiaceae and other related plants.
Funding: This research was supported by the National Natural Science Foundation of China (31972460 and 31801898), the earmarked fund for the China Agriculture Research System (CARS-19), and the Key Research and Development Program of Jiangsu Province (BE2019379). This work was supported by the high-performance computing platform of the Bioinformatics Center, Nanjing Agricultural University. F.C. is supported by a start-up fund (804012) from Nanjing Agricultural University. Additional support came from the Fundamental Research Funds for the Central Universities (KYXJ202004).
Abstract: Tea, coffee, and cocoa are the three most popular nonalcoholic beverages in the world and have extremely high economic and cultural value. The genomes of four tea plant varieties have recently been sequenced, but there is some debate regarding the characterization of a whole-genome duplication (WGD) event in tea plants. Whether the WGD in the tea plant is shared with other plants in the order Ericales, and how it contributed to tea plant evolution, remained unanswered. Here we re-analyzed the tea plant genome and provided evidence that tea experienced only one WGD event after the core-eudicot whole-genome triplication (WGT) event. This WGD was shared by the Polemonioids-Primuloids-Core Ericales (PPC) sections, encompassing at least 17 families in the order Ericales. In addition, our study identified eight pairs of duplicated genes in the catechins biosynthesis pathway, four pairs of duplicated genes in the theanine biosynthesis pathway, and one pair of genes in the caffeine biosynthesis pathway, which were expanded and retained following this WGD. Nearly all these gene pairs were expressed in tea plants, implying the contribution of the WGD. This study shows that, in addition to the recent tandem gene duplication that contributed to the accumulation of tea flavor-related genes, the WGD may have been another main factor driving the evolution of tea flavor.
Funding: The Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31010300), the National Key Research and Development Program of China (2017YFC0505203), and the National Natural Science Foundation of China (31590821 and 31900201).
Abstract: Hazelnut is popular for its flavor, and it has also been suggested that hazelnut is beneficial to cardiovascular health because it is rich in oleic acid. Here, we report the first high-quality chromosome-scale genome for the hazelnut species Corylus mandshurica (2n = 22), which has a high concentration of oleic acid in its nuts. The assembled genome is 367.67 Mb in length, and the contig N50 is 14.85 Mb. All contigs were assembled into 11 chromosomes, and 28,409 protein-coding genes were annotated. By inferring homology among five Betulaceae genomes, we reconstructed the evolutionary trajectories of the genomes of Betulaceae species and revealed that the 11 chromosomes of the hazelnut genus were derived from the most ancestral karyotype in Betula pendula, which has 14 protochromosomes. We identified 96 candidate genes involved in oleic acid biosynthesis, and 10 showed rapid evolution or positive selection. These findings will help us to understand the mechanisms of lipid synthesis and storage in hazelnuts. Several gene families related to salicylic acid metabolism and stress responses experienced rapid expansion in this hazelnut species, which may have increased its stress tolerance. The reference genome presented here constitutes a valuable resource for molecular breeding and genetic improvement of the important agronomic properties of hazelnut.
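The contig N50 reported in this abstract is a standard assembly statistic: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembly size. The following is a minimal sketch of that definition. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the Corylus mandshurica assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, NOT the real hazelnut contigs.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A larger N50 means that half of the assembly sits in longer contigs, which is why the statistic is quoted alongside total assembly length.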
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31000000 to J.L. and Y.Y.), the PhD Programs Foundation of the Department of Education of Gansu (2021QB007 to Y.Y.), the Science Fund for Creative Research Groups of Gansu Province (21JR7RA533 to Y.Y.), the Young Talent Development Project of the State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems (2021+02 to Y.Y.), and the International Collaboration 111 Programme (BP0719040).
Abstract: Angiosperms dominate the Earth’s ecosystems and provide most of the basic necessities for human life. The major angiosperm clades comprise 64 orders, as recognized by the APG IV classification. However, the phylogenetic relationships of angiosperms remain unclear, as phylogenetic trees with different topologies have been reconstructed depending on the sequence datasets utilized, from targeted genes to transcriptomes. Here, we used currently available de novo genome data to reconstruct the phylogenies of 366 angiosperm species from 241 genera belonging to 97 families across 43 of the 64 orders based on orthologous genes from the nuclear, plastid, and mitochondrial genomes of the same species with compatible datasets. The phylogenetic relationships were largely consistent with previously constructed phylogenies based on sequence variations in each genome type. However, there were major inconsistencies in the phylogenetic relationships of the five Mesangiospermae lineages when different genomes were examined. We discuss ways to address these inconsistencies, which could ultimately lead to the reconstruction of a comprehensive angiosperm tree of life. The angiosperm phylogenies presented here provide a basic framework for further updates and comparisons. These phylogenies can also be used as guides to examine the evolutionary trajectories among the three genome types during lineage radiation.
Funding: This work was supported equally by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31000000), the National Natural Science Foundation of China (grant numbers 31590821 and 91731301 to J.L. and 32070669 to X.W.), and the National Key Research and Development Program of China (2017YFC0505203 to Z.X.), and also by the Fundamental Research Funds for the Central Universities (SCU2019D013 and 2020SCUNL207) and the National High-Level Talents Special Support Plan (10 Thousand People Plan).
Abstract: Evidence of whole-genome duplications (WGDs) and subsequent karyotype changes has been detected in most major lineages of living organisms on Earth. Convenient and accurate toolkits are needed to clarify the complex, multi-layered patterns of gene collinearity that result. To meet this need, we developed WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidization events and cross-species genome alignments. WGDI supports three main workflows (polyploid inference, hierarchical inference of genomic homology, and ancestral chromosome karyotyping) that can improve the detection of WGD and characterization of WGD-related events based on high-quality chromosome-level genomes. Notably, it can extract complete synteny blocks and facilitate reconstruction of detailed karyotype evolution. This toolkit is freely available at GitHub (https://github.com/SunPengChuan/wgdi). As an example of its application, WGDI convincingly clarified karyotype evolution in Aquilegia coerulea and Vitis vinifera following WGDs and rejected the hypothesis that Aquilegia contributed as a parental lineage to the allopolyploid origin of core dicots.
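To make the notion of a synteny (collinear) block concrete, the sketch below groups homologous gene pairs into runs that keep the same order, or the exactly reversed order, in two genomes. This is not WGDI's algorithm or API. It is a deliberately simple greedy illustration, and the function name `collinear_blocks`, its parameters, and the toy anchors are all assumptions made for the example.

```python
def collinear_blocks(anchors, max_gap=5, min_size=3):
    """Greedily group (pos_a, pos_b) gene-order anchors into collinear runs.

    anchors  : iterable of (gene index in genome A, gene index in genome B)
    max_gap  : largest allowed step between neighbouring anchors, per genome
    min_size : minimum number of anchors for a run to be reported
    """
    blocks = []
    current = []
    direction = 0  # +1 forward, -1 inverted, 0 not yet determined
    for a, b in sorted(anchors):
        if current:
            prev_a, prev_b = current[-1]
            step_a = a - prev_a
            step_b = b - prev_b
            step_dir = (step_b > 0) - (step_b < 0)
            fits = (
                0 < step_a <= max_gap
                and 0 < abs(step_b) <= max_gap
                and direction in (0, step_dir)
            )
            if fits:
                current.append((a, b))
                direction = step_dir
                continue
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
        # Start a new run at the current anchor.
        current = [(a, b)]
        direction = 0
    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Hypothetical anchors: one forward run, one inverted run, one stray hit.
toy = [(1, 10), (2, 11), (3, 12), (4, 13),
       (20, 50), (21, 49), (22, 48), (40, 5)]
for block in collinear_blocks(toy):
    print(block[0], "->", block[-1], "anchors:", len(block))
```

A greedy pass like this follows only one chain at a time and has no scoring, so overlapping or nested blocks from recursive polyploidization would be missed. That layered signal is what a dedicated toolkit is meant to resolve.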
Funding: The Ministry of Science and Technology of the People’s Republic of China (Grant No. 2016YFD0101001), the China National Science Foundation (Grant Nos. 31371282 to XW, 31510333 to JW, and 31661143009 to XW), the Natural Science Foundation of Hebei Province (Grant No. C2015209069 to JW), and the Tangshan Key Laboratory Project to XW.
Abstract: Lycophytes and seed plants constitute the typical vascular plants. Lycophytes have been thought to have undergone no paleo-polyploidization, although such events are known to have been critical for the fast expansion of seed plants. Here, genomic analyses, including homologous gene dot plot analysis, detected multiple paleo-polyploidization events during the evolution of the genome of Selaginella moellendorffii, a model lycophyte, with one occurring approximately 13–15 million years ago (MYA) and another about 125–142 MYA. In addition, comparative analysis of reconstructed ancestral genomes of lycophytes and angiosperms suggested that lycophytes were affected by more paleo-polyploidization events than seed plants. Results from the present genomic analyses indicate that paleo-polyploidization has contributed to the successful establishment of both lineages of vascular plants, lycophytes and seed plants.
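The abstract does not state how the ages of the polyploidization events were obtained. One widely used approach converts the synonymous substitution rate (Ks) peak of duplicated gene pairs into time with T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below shows only that arithmetic. The function name `ks_to_mya` and the numeric values are hypothetical and are not taken from the study.

```python
def ks_to_mya(ks, rate_per_site_per_year):
    """Convert a Ks value into an age in million years ago (MYA).

    Uses T = Ks / (2 * r): the two duplicate copies diverge independently,
    so substitutions accumulate along both lineages.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Hypothetical inputs for illustration only: a Ks peak of 0.13 and an
# assumed rate of 6.5e-9 substitutions per synonymous site per year.
age = ks_to_mya(0.13, 6.5e-9)
print(f"approximate age: {age:.1f} MYA")
```

Because the estimate scales inversely with the assumed rate, an age range such as those quoted in the abstract can arise from a range of Ks values, a range of plausible rates, or both.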