Funding: This study was supported by the National Key R&D Program of China (Grant No. 2020YFC0847000) and the National Natural Science Foundation of China (Grant Nos. 31571370, 91731302, and 31772435).
Abstract: A novel RNA virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensive sampling bias and the existence of infection clusters during the epidemic spread, direct application of existing approaches can lead to biased parameter estimates and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10^(−4) per site per year with a 95% confidence interval (CI) of [8.61 × 10^(−4), 8.77 × 10^(−4)], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might have originated earlier than, and outside of, the Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns generated by infection clusters, including the enrichment of specific haplotypes and the temporal allele frequency trajectories, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently disentangle the effects of sampling bias and infection clusters in order to gain insights into the evolutionary mechanism of SARS-CoV-2. Software implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.
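To make the two estimated quantities concrete, the sketch below fits a plain strict-clock regression of genetic distance on sampling date. This is not VirusMuT or the robust estimator described in the abstract; it is the ordinary least-squares approach that, as the abstract notes, can be biased by sampling bias and infection clusters. The function name `fit_clock`, the sampling dates, and the noise-free synthetic distances are all assumptions made for illustration; the rate and TMRCA reported in the abstract are used only to generate the toy data.

```python
from datetime import date, timedelta

DAYS_PER_YEAR = 365.25

def fit_clock(sample_dates, distances):
    """Least-squares fit of distance = rate * (t - t_mrca).

    sample_dates: list of datetime.date (sampling date of each genome)
    distances: per-site genetic distance of each genome from the ancestor
    Returns (rate per site per year, estimated TMRCA as a date).
    """
    t0 = min(sample_dates)
    # Sampling times in years, measured from the earliest sample.
    xs = [(d - t0).days / DAYS_PER_YEAR for d in sample_dates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, distances))
    rate = sxy / sxx                      # slope: substitutions/site/year
    intercept = mean_y - rate * mean_x
    # The fitted line crosses zero distance at the TMRCA.
    offset_days = round(-intercept / rate * DAYS_PER_YEAR)
    return rate, t0 + timedelta(days=offset_days)

# Synthetic, noise-free toy data (hypothetical sampling dates).
assumed_rate = 8.69e-4                    # value reported in the abstract
assumed_tmrca = date(2019, 11, 28)        # value reported in the abstract
dates = [date(2020, 1, 1), date(2020, 2, 1), date(2020, 3, 1), date(2020, 4, 1)]
dists = [assumed_rate * (d - assumed_tmrca).days / DAYS_PER_YEAR for d in dates]

rate, tmrca = fit_clock(dates, dists)
print(f"rate = {rate:.3e} per site per year, TMRCA = {tmrca}")
```

On noise-free data the regression recovers the inputs exactly. With real samples, uneven sampling over time and over-represented infection clusters would pull both the slope and the intercept, which is the problem the robust estimator in the abstract is meant to address.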
Funding: Supported by the National Key R&D Program of China (2021YFC0863400), the Key Program of Chinese Academy of Sciences (KJZD-SW-L14), and the National Natural Science Foundation of China (Grant Nos. 31571370 and 91731302).
Abstract: By reanalyzing public metagenomic data from 101 patients infected with influenza A virus during the 2007–2012 H1N1 flu seasons in France, we identified 22 samples with SARS-CoV sequences. In three of them, the SARS genome sequence could be fully assembled from each sample. These sequences are highly similar (99.99% and 99.70%) to the artificially constructed recombinant SARS-CoV (SARSr-CoV) strains generated by the J. Craig Venter Institute in the USA. Moreover, samples from different flu seasons have different SARS-CoV strains, and the divergence between these strains cannot be explained by natural evolution. Our study also shows that retrospective studies using public metagenomic data from past major epidemic outbreaks can serve as a genomic strategy for research on the origins or spread of infectious diseases.