Sequence assembly is an important step in bioinformatics studies. With next-generation sequencing (NGS) technology, high-throughput DNA fragments (reads) can be randomly sampled from DNA or RNA molecules. However, because the positions from which the reads are sampled are unknown, an assembly process is required to combine overlapping reads and reconstruct the original DNA or RNA sequence. Compared with traditional Sanger sequencing, NGS offers higher throughput, but its reads are shorter and have a higher error rate, which introduces several problems in assembly. Moreover, paired-end reads, which contain more information than single-end reads, can be sampled. Existing assemblers cannot fully utilize this information and fail to assemble longer contigs. In this article, we revisit the major problems of assembling NGS reads from genomic, transcriptomic, metagenomic and metatranscriptomic data. We also describe our IDBA package for solving these problems. The IDBA package adopts several novel ideas in assembly, including the use of multiple values of k, local assembly and progressive depth removal. Compared with existing assemblers, IDBA performs better on many simulated and real sequencing datasets.
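The idea of combining overlapping reads can be illustrated with a minimal sketch. This is not the IDBA algorithm, which the abstract does not detail; it is a naive suffix-prefix merge of two error-free reads. The function names `overlap_length` and `merge_reads` and the `min_overlap` parameter are hypothetical and chosen only for this illustration.

```python
def overlap_length(left, right, min_overlap=3):
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if it is shorter than `min_overlap`.

    Assumes error-free reads and min_overlap >= 1 (illustrative only).
    """
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for length in range(max_len, min_overlap - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def merge_reads(left, right, min_overlap=3):
    """Merge two reads if the end of `left` overlaps the start of `right`.

    Returns the merged sequence, or None when no sufficient overlap exists.
    """
    length = overlap_length(left, right, min_overlap)
    if length == 0:
        return None
    return left + right[length:]


if __name__ == "__main__":
    # Two toy reads sharing the 4-base overlap "TGCA".
    print(merge_reads("ACGTTGCA", "TGCAGGT"))
```

Real NGS reads contain sequencing errors and repeats, so exact pairwise merging like this does not scale; that is part of why practical assemblers use other strategies.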
Next-generation sequencing (NGS) technologies have revolutionized the field of genomics and provided unprecedented opportunities for high-throughput analysis at the levels of genomics, transcriptomics and epigenetics. However, the cost of NGS is still prohibitive for many laboratories, so it is imperative to address the trade-off between sequencing depth and cost. In this review, we discuss the effects of sequencing depth on the detection of genes, the quantification of gene expression and the discovery of gene structural variants. This will help readers choose the sequencing depth that best meets the needs of their particular project.
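As a rough illustration of the depth-versus-cost trade-off, the average sequencing depth of a genome is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that standard estimate; the function names and the example numbers are assumptions made for illustration and are not taken from the review.

```python
import math


def expected_depth(num_reads, read_length, genome_size):
    """Average depth (coverage) = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth, read_length, genome_size):
    """Smallest number of reads whose total bases reach the target depth."""
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical example: 1 million 100-bp reads on a 5-Mbp genome.
    print(expected_depth(1_000_000, 100, 5_000_000))
    # Hypothetical example: reads required for 30x depth with 150-bp reads
    # on a 3-Mbp genome.
    print(reads_needed(30, 150, 3_000_000))
```

Because the number of reads scales linearly with the target depth, so does the sequencing cost to a first approximation, which is why choosing the lowest depth that still meets a project's needs matters.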
Funding: Supported in part by Hong Kong GRF (HKU 7111/12E, 719611E), Shenzhen Basic Research Project JCYJ20120618143038947 (SIRI/04/04/2012/05), and the Outstanding Researcher Award (102009124).