Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients

Abstract: Background: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. Methods: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. Results: The main results of our investigations are as follows: (1) subsampling of RNA-seq data, but not direct scaling of count matrices, gave biologically realistic simulations of sequencing experiments with smaller sequencing depth; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. Conclusions: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine.
Source: Chinese Journal of Cancer (SCIE, CAS, CSCD), 2015, Issue 10, pp. 427-438 (12 pages)
Funding: Supported in part by the Arkansas Biosciences Institute under Grant (No. UL1TR000039) and the IDeA Networks of Biomedical Research Excellence (INBRE) Grant (No. P20RR16460)
Keywords: RNA-seq data; Computational genomics; Statistical robustness; High-dimensional biology; Triple-negative breast cancer
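
The contrast drawn in result (1) of the abstract, subsampling reads versus directly scaling a count matrix, can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy count matrix, the 10% fraction, and the function names `subsample_counts` and `scale_counts` are all assumptions made for illustration, and binomial thinning is used here as one common way to emulate drawing a random subset of reads.

```python
import numpy as np


def subsample_counts(counts, fraction, rng):
    """Binomial thinning: keep each read independently with probability `fraction`.

    This emulates sequencing the same library at lower depth, so low-count
    genes can randomly drop to zero or survive.
    """
    return rng.binomial(counts, fraction)


def scale_counts(counts, fraction):
    """Direct scaling: multiply every count by `fraction` and round.

    This is deterministic, so it adds no sampling noise at all.
    """
    return np.rint(counts * fraction).astype(np.int64)


rng = np.random.default_rng(0)

# Hypothetical toy matrix (2000 genes x 6 samples); NOT data from the study.
counts = rng.negative_binomial(n=0.5, p=0.01, size=(2000, 6))

fraction = 0.1  # assumed target: 10% of the original depth
sub = subsample_counts(counts, fraction, rng)
scaled = scale_counts(counts, fraction)

# Compare how the two approaches treat weakly expressed genes (1 to 4 reads).
low = (counts > 0) & (counts <= 4)
print("low-count entries:", int(low.sum()))
print("still detected after subsampling:", int((sub[low] > 0).sum()))
print("still detected after scaling:", int((scaled[low] > 0).sum()))
```

In this sketch, scaling rounds every weakly expressed entry to zero, whereas subsampling lets some of them survive by chance, which is closer to what a shallower sequencing run would produce. Real subsampling is often done at the read level (for example on alignment files) rather than on the count matrix.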