Funding: Supported by the National Basic Research Program of China (Grant Nos. 2010CB945401, 2007CB108800), the National Natural Science Foundation of China (Grant Nos. 30870575, 31071162, 31000590), and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).
Abstract: RNA-Seq technology is becoming widely used in transcriptomics studies; however, analyzing and interpreting RNA-Seq data poses serious challenges. With the development of high-throughput sequencing technologies, sequencing cost is dropping dramatically while sequencing output is increasing sharply. However, sequencing reads are still short and contain various sequencing errors. Moreover, the transcriptome is consistently more complicated than expected. These challenges create an urgent need for efficient bioinformatics algorithms that can handle the large amount of transcriptome sequencing data and support diverse related studies. This review summarizes a number of frequently used applications of transcriptome sequencing and their related analysis strategies, including short read mapping, exon-exon splice junction detection, gene or isoform expression quantification, differential expression analysis, and transcriptome reconstruction.
Funding: Supported by the National Basic Research Program of China (Grant Nos. 2010CB945401, 2007CB108800), the National Natural Science Foundation of China (Grant Nos. 30870575, 31071162, 31000590), and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).
Abstract: De novo transcriptome assembly is an important approach in RNA-Seq data analysis: it allows the transcriptome to be reconstructed and gene expression profiles to be investigated without a reference genome sequence. We carried out transcriptome assemblies with two RNA-Seq datasets, generated from human brain and a cell line, respectively. We then determined an efficient way to obtain an optimal overall assembly by comparing three strategies. First, we assembled the brain and cell line transcriptomes using a single k-mer length. Next, we tested a range of k-mer lengths and coverage cutoffs in assembly. Lastly, we combined the contigs assembled over a range of k values to generate a final assembly. Comparing these results, we found that a single k-mer value is not enough to generate a good assembly, whereas combining the contigs from different k-mer values yields longer contigs and greatly improves the overall assembly.
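The abstract does not say how contigs from different k values were combined, so the following Python sketch is only one plausible illustration of the idea, not the authors' procedure. It pools the contigs from several single-k assemblies, drops any contig that is an exact duplicate of, or fully contained in, a longer contig (on either strand), and reports N50 as a rough measure of contiguity. The function names, the `min_len` parameter, and the containment-based redundancy rule are all assumptions made for this example.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_contigs(assemblies: Dict[int, List[str]], min_len: int = 100) -> List[str]:
    """Pool contigs from several k values and remove redundant ones.

    `assemblies` maps a k-mer length to the contigs assembled with it.
    A contig is dropped when it, or its reverse complement, is a
    substring of a longer contig that has already been kept.
    This containment rule is an illustrative assumption.
    """
    # The set removes exact duplicates; sorting puts the longest contigs first.
    pooled = sorted(
        {c.upper() for contigs in assemblies.values() for c in contigs
         if len(c) >= min_len},
        key=lambda s: (-len(s), s),
    )
    kept: List[str] = []
    for contig in pooled:
        rc = reverse_complement(contig)
        if any(contig in longer or rc in longer for longer in kept):
            continue  # redundant: already represented by a longer contig
        kept.append(contig)
    return kept


def n50(contigs: List[str]) -> int:
    """Length L such that contigs of length >= L cover at least half the total."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    # Toy contigs standing in for the output of two single-k assemblies.
    toy = {
        21: ["ACGTACGTAA", "GGGCCC"],
        31: ["TTACGTACGTAACC", "ACGTACGTAA"],
    }
    merged = merge_contigs(toy, min_len=5)
    print(merged, n50(merged))
```

The quadratic substring scan is adequate for a toy example only; real assemblies contain many thousands of contigs and are usually de-duplicated with dedicated clustering or overlap-based tools.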