Funding: Supported by the Ministry of Agriculture and Rural Affairs of China, the National Key R&D Program of China (Grant No. 2019YFA0802600), the Chinese Academy of Sciences (Grant Nos. ZDBS-LY-SM005 and XDPB17), and the National Natural Science Foundation of China (Grant No. 31970565).
Abstract: Next-generation sequencing (NGS), represented by Illumina platforms, has become an essential cornerstone of basic and applied research. However, its sequencing error rate of 1 per 1000 bp (10^(−3)) is a serious hurdle for research areas that focus on rare mutations, such as somatic mosaicism or microbial heterogeneity. By examining the high-fidelity sequencing methods developed in the past decade, we summarize three major factors underlying errors and the 12 corresponding strategies for mitigating them. We then propose a novel framework that classifies 11 representative existing methods according to the strategies they combine, and we identify three trends that emerged during methodological development. We further extend this analysis to eight long-read sequencing methods, emphasizing error reduction strategies. Finally, we suggest two promising future directions that could achieve comparable or even higher accuracy at lower cost in both NGS and long-read sequencing.
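The arithmetic behind the "serious hurdle" can be illustrated with a rough sketch. It assumes a hypothetical sequencing depth and variant frequency (neither is given in the abstract) and treats the 10^(−3) per-base error rate as uniform across sites, which is a simplification. The function name `expected_read_counts` is introduced here for illustration only.

```python
def expected_read_counts(depth, variant_fraction, error_rate=1e-3):
    """Expected reads at one site that carry a true variant vs. a miscall.

    depth            -- number of reads covering the site (assumed value)
    variant_fraction -- fraction of molecules carrying the rare mutation (assumed value)
    error_rate       -- per-base sequencing error rate; 1e-3 is the figure in the abstract
    """
    true_reads = depth * variant_fraction   # reads that really carry the mutation
    error_reads = depth * error_rate        # reads miscalled at this site (any wrong base)
    return true_reads, error_reads


# Hypothetical scenario: 10,000x coverage, mutation present in 0.01% of molecules.
true_reads, error_reads = expected_read_counts(depth=10_000, variant_fraction=1e-4)
print(f"expected true variant reads: {true_reads:.1f}")
print(f"expected miscalled reads:    {error_reads:.1f}")
```

Under these assumed numbers the miscalled reads outnumber the true variant reads roughly tenfold, which is why a mutation rarer than the error rate cannot be called from raw reads alone.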
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61232001, 61128006, and 61073036).
Abstract: The recent breakthroughs in next-generation sequencing technologies, such as those of Roche 454, Illumina/Solexa, and ABI SOLID, have dramatically reduced the cost of producing short reads of the genome of new species. The huge volume of reads, along with short read length, high coverage, and sequencing errors, poses a great challenge to de novo genome assembly. However, paired-end information provides a new solution to these problems. In this paper, we review and compare some current assembly tools, including Newbler, CAP3, Velvet, SOAPdenovo, AllPaths, Abyss, IDBA, PE-Assembly, and Telescoper. In general, we compare the seed-extension and graph-based methods, which use the overlap/layout/consensus approach and the de Bruijn graph approach for assembly. At the end of the paper, we summarize these methods and discuss future directions of genome assembly.
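As a minimal sketch of the de Bruijn graph approach named above, the following toy example splits reads into k-mers, links each k-mer's (k-1)-mer prefix to its suffix, and walks a non-branching path to recover a contig. It is not the implementation of any of the listed assemblers: the function names, the reads, and the choice of k are hypothetical, and the sketch ignores sequencing errors, reverse complements, and paired-end information.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # one edge per distinct k-mer
    return dict(graph)


def walk_unambiguous_path(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig = start
    node = start
    visited = {start}
    while node in graph and len(graph[node]) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]   # each step adds one new base
        visited.add(nxt)
        node = nxt
    return contig


# Toy, error-free reads that overlap along the sequence ATGGCGTGCA.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous_path(graph, "ATG")
print(contig)
```

With real data, sequencing errors introduce spurious k-mers and repeats create branching nodes, so a practical assembler needs additional graph-cleaning and repeat-resolution steps that this sketch omits.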