Funding: Supported by the Key Science & Technology Project of Gansu Province (22ZD6NA007) and the National Key Research and Development Program of China (2021YFD2200202). Computing support was provided by the Supercomputing Center of Lanzhou University.
Abstract: The transition of traits between genetically related lineages is a fascinating topic that provides clues to understanding the drivers of speciation and diversification. Much can be learned about this process from phylogeny-based trait evolution. However, such inference is often plagued by genome-wide gene-tree discordance (GTD), mostly due to incomplete lineage sorting (ILS) and/or introgressive hybridization, especially when the genes underlying the traits appear discordant. Here, by collecting transcriptomes, whole chloroplast genomes (cpDNA), and population genetic datasets, we used the coalescent model to turn GTD into a source of information for ILS and employed hemiplasy to explain specific cases of apparent "phylogenetic discordance" between different morphological traits and the probable species phylogeny in Allium subg. Cyathophora. Both concatenation and coalescence methods consistently recovered the same phylogenetic topology for species tree inference based on single-copy genes (SCGs), as supported by the KS distribution. However, GTD was high across the genomes of subg. Cyathophora: approximately 27% to 38.9% of the SCG trees were in conflict with the species tree. Plastid and nuclear incongruence was also present. Our coalescent simulations indicated that such GTD was mainly a product of ILS. Our hemiplasy risk factor calculations supported the view that random fixation of ancient polymorphisms in different populations during successive speciation events along the subg. Cyathophora phylogeny may have caused the character transition, as well as the anomalous cpDNA tree. Our study exemplifies how phylogenetic noise can be transformed into evolutionary information for understanding character state transitions along species phylogenies.
Funding: Financial support was provided by the National Natural Science Foundation of China (32272090, 32171994, and 32072023), the Central Plains Science and Technology Innovation Leader Project (214200510029 and 2022C01NY001), the Project of Sanya Yazhou Bay Science and Technology City (SCKY-JYRC-2022-88), and the National Key R&D Program of China (2021YFE0101200).
Abstract: Cotton (Gossypium) is a crucial economic crop and the primary source of natural fiber for the textile sector. However, the evolutionary mechanisms driving speciation within the genus Gossypium remain unresolved. In this investigation, we leveraged 25 Gossypium genomes and introduced four novel assemblies, those of G. harknessii, G. gossypioides, G. trilobum, and G. klotzschianum (Gklo), to delve into the speciation history of this genus. Notably, we encountered intricate phylogenies potentially stemming from introgression. These complexities are further compounded by incomplete lineage sorting (ILS), a factor likely to have been instrumental in shaping the swift diversification of cotton. Our focus subsequently shifted to the rapid radiation episode during a short period in Gossypium evolution. For a recently diverged lineage comprising G. davidsonii, Gklo, and G. raimondii, we constructed a finely detailed ILS map. Intriguingly, this analysis revealed a non-random distribution of ILS regions across the reference Gklo genome. Moreover, we identified signs of robust natural selection influencing specific ILS regions. Noteworthy variations pertaining to speciation emerged between the closely related sister species Gklo and G. davidsonii. Approximately 15.74% of speciation structural variation genes and 12.04% of speciation-associated genes were estimated to intersect with ILS signatures. These findings enrich our understanding of the role of ILS in adaptive radiation, shedding fresh light on the intricate speciation history of the genus Gossypium.
Abstract: Although the effects of the coalescent process on sequence divergence and genealogies are well understood, the virtual majority of studies that use molecular sequences to estimate times of divergence among species have failed to account for the coalescent process. Here we study the impact of ancestral population size and incomplete lineage sorting on Bayesian estimates of species divergence times under the molecular clock when the inference model ignores the coalescent process. Using a combination of mathematical analysis, computer simulations and analysis of real data, we find that the errors in estimates of times and the molecular rate can be substantial when ancestral populations are large and when there is substantial incomplete lineage sorting. For example, in a simple three-species case, we find that if the most precise fossil calibration is placed on the root of the phylogeny, the age of the internal node is overestimated, while if the most precise calibration is placed on the internal node, then the age of the root is underestimated. In both cases, the molecular rate is overestimated. Using simulations on a phylogeny of nine species, we show that substantial errors in time and rate estimates can be obtained even when dating ancient divergence events. We analyse the hominoid phylogeny and show that estimates of the neutral mutation rate obtained while ignoring the coalescent are too high. Using a coalescent-based technique to obtain geological times of divergence, we obtain estimates of the mutation rate that are within experimental estimates and we also obtain substantially older divergence times within the phylogeny [Current Zoology 61 (5): 874-885, 2015].
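The three-species case mentioned in this abstract can be illustrated with two standard results from coalescent theory. The sketch below is not the authors' inference method. It assumes a diploid population of constant size with no gene flow, and the function names `discordance_probability` and `expected_extra_coalescence_time` are hypothetical, introduced here for illustration only. It shows why large ancestral populations matter: the chance that a gene tree disagrees with the species tree grows as the internal branch shortens relative to population size, and sequences sampled from two species are expected to share an ancestor about 2N generations before the species split.

```python
import math


def discordance_probability(internal_branch_generations, ancestral_pop_size):
    """Probability that a gene tree disagrees with a three-species tree.

    Under the multispecies coalescent, two lineages fail to coalesce on the
    internal branch with probability exp(-T), where T = t / (2N) is the branch
    length in coalescent units (assumption: diploid, constant size N).
    If they fail, two of the three equally likely topologies are discordant,
    giving P = (2/3) * exp(-T).
    """
    if internal_branch_generations < 0 or ancestral_pop_size <= 0:
        raise ValueError("branch length must be >= 0 and population size > 0")
    t_coalescent = internal_branch_generations / (2.0 * ancestral_pop_size)
    return (2.0 / 3.0) * math.exp(-t_coalescent)


def expected_extra_coalescence_time(ancestral_pop_size):
    """Expected generations by which sequence divergence predates the species split.

    Two lineages entering a diploid ancestral population of size N take
    2N generations on average to coalesce.
    """
    if ancestral_pop_size <= 0:
        raise ValueError("population size must be > 0")
    return 2.0 * ancestral_pop_size


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the trend.
    for n in (1_000, 10_000, 100_000):
        p = discordance_probability(internal_branch_generations=20_000,
                                    ancestral_pop_size=n)
        extra = expected_extra_coalescence_time(n)
        print(f"N={n}: P(discordant)={p:.4f}, extra time={extra:.0f} generations")
```

A dating model that ignores this extra coalescence time treats sequence divergence as species divergence, which is one intuitive way to see how the biases described in the abstract can arise.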
Funding: Funded by the National Natural Science Foundation of China (NSFC32020103005), the Third Xinjiang Scientific Expedition and Research (XIKK) (2022xjkk0205), and the Second Tibetan Plateau Scientific Expedition and Research (2019QZKK0501).
Abstract: Environmentally heterogeneous mountains provide opportunities for rapid diversification and speciation. The family Prunellidae (accentors) is a group of birds comprising primarily mountain specialists that have recently radiated across the Palearctic region. This rapid diversification poses challenges to resolving their phylogeny. Herein we sequenced the complete mitogenomes and estimated the phylogeny using all 12 (including 28 individuals) currently recognized species of Prunellidae. We reconstructed the mitochondrial genome phylogeny using 13 protein-coding genes of 12 species and 2 Eurasian Tree Sparrows (Passer montanus). Phylogenetic relationships were estimated using a suite of analyses: maximum likelihood, maximum parsimony and the coalescent-based SVDquartets. Divergence times were estimated by implementing a Bayesian relaxed clock model in BEAST2. Based on the BEAST time-calibrated tree, we implemented an ancestral area reconstruction using RASP v.4.3. Our phylogenies based on the maximum likelihood, maximum parsimony and SVDquartets approaches support a clade of large-sized accentors (subgenus Laiscopus) as sister to all other, small-sized accentors (subgenus Prunella). In addition, the trees also support the sister relationship of P. immaculata and P. rubeculoides + P. atrogularis with 100% bootstrap support, but the relationships among the remaining eight species in the Prunella clade are poorly resolved. These species cluster in different positions in the three phylogenetic trees and the nodes are often poorly supported. The five nodes separating the seven species diverged simultaneously within less than half a million years (i.e., between 2.71 and 3.15 million years ago), suggesting that the recent radiation is likely responsible for rampant incomplete lineage sorting and gene tree conflicts. Ancestral area reconstruction indicates a central Palearctic origin for Prunellidae. Our study highlights that whole mitochondrial genome phylogeny can resolve major lineages within Prunellidae but is not sufficient to fully resolve the relationships among the species in the Prunella clade, which diversified almost simultaneously within a short time period. Our results emphasize the challenge of reconstructing reliable phylogenetic relationships in a group of recently radiated species.
Funding: Supported by the Priority Research Program of the Chinese Academy of Sciences (CAS) (Grant No. XDB31000000) and the Large-scale Scientific Facilities of the CAS (Grant No. 2017LSF-GBOWS-2).
Abstract: The advances accelerated by next-generation sequencing and long-read sequencing technologies continue to provide an impetus for plant phylogenetic study. In the past decade, a large number of phylogenetic studies adopting hundreds to thousands of genes across a wealth of clades have emerged and ushered plant phylogenetics and evolution into a new era. In the meantime, a roadmap for researchers when making decisions across different approaches for their phylogenomic research design is imminent. This review focuses on the utility of genomic data (from organelle genomes to both reduced representation sequencing and whole-genome sequencing) in phylogenetic and evolutionary investigations, describes the baseline methodology of experimental and analytical procedures, and summarizes recent progress in flowering plant phylogenomics at the ordinal, familial, tribal, and lower levels. We also discuss the challenges, such as the adverse impact of systematic errors on orthology inference and phylogenetic reconstruction, and underlying biological factors, such as whole-genome duplication, hybridization/introgression, and incomplete lineage sorting, which together suggest that a bifurcating tree may not be the best model for the tree of life. Finally, we discuss promising avenues for future plant phylogenomic studies.
Funding: Supported by the National Key Research and Development Program of China (2017YFD0600701).
Abstract: Redwood trees (Sequoioideae), including Metasequoia glyptostroboides (dawn redwood), Sequoiadendron giganteum (giant sequoia), and Sequoia sempervirens (coast redwood), are threatened and widely recognized iconic tree species. Genomic resources for redwood trees could provide clues to their evolutionary relationships. Here, we report the 8-Gb reference genome of M. glyptostroboides and a comparative analysis with two related species. More than 62% of the M. glyptostroboides genome is composed of repetitive sequences. Clade-specific bursts of long terminal repeat retrotransposons may have contributed to genomic differentiation in the three species. The chromosomal synteny between M. glyptostroboides and S. giganteum is extremely high, whereas there has been significant chromosome reorganization in S. sempervirens. Phylogenetic analysis of marker genes indicates that S. sempervirens is an autopolyploid, and more than 48% of the gene trees are incongruent with the species tree. Results of multiple analyses suggest that incomplete lineage sorting (ILS) rather than hybridization explains the inconsistent phylogeny, indicating that genetic variation among redwoods may be due to random retention of polymorphisms in ancestral populations. Functional analysis of ortholog groups indicates that gene families of ion channels, tannin biosynthesis enzymes, and transcription factors for meristem maintenance have expanded in S. giganteum and S. sempervirens, which is consistent with their extreme height. As a wetland-tolerant species, M. glyptostroboides shows a transcriptional response to flooding stress that is conserved with that of analyzed angiosperm species. Our study offers insights into redwood evolution and adaptation and provides genomic resources to aid in their conservation and management.
Funding: S.X. acknowledges financial support from the National Natural Science Foundation of China (NSFC) grants (Nos. 91331204 and 31711530221), the Strategic Priority Research Program (No. XDB13040100) and Key Research Program of Frontier Sciences (No. QYZDJ-SSW-SYS009) of the Chinese Academy of Sciences (CAS), the National Science Fund for Distinguished Young Scholars (No. 31525014), and the Program of Shanghai Academic Research Leader (No. 16XD1404700). S.X. is a Max-Planck Independent Research Group Leader and a member of the CAS Youth Innovation Promotion Association. S.X. also gratefully acknowledges the support of the National Program for Top-notch Young Innovative Talents of the "Wanren Jihua" Project. We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Abstract: Background: Genetic admixture refers to the process or consequence of interbreeding between two or more previously isolated populations within a species. Compared to many other evolutionary driving forces such as mutation, genetic drift, and natural selection, genetic admixture is a quick mechanism for shaping population genomic diversity. In particular, admixture results in "recombination" of genetic variants that have been fixed in different populations, which has many evolutionary and medical implications. Results: However, it is challenging to accurately reconstruct population admixture history and to understand population admixture dynamics. In this review, we provide an overview of models, methods, and tools for ancestry inference and admixture analysis. Conclusions: Many methods and tools used for admixture analysis were originally developed to analyze human data, but these methods can also be applied directly, or with slight modification, to study non-human species.