Examining the practical limits of batch effect-correction algorithms: When should you care about batch effects? (Cited by: 1)

Abstract: Batch effects are technical sources of variation and can confound analysis. While many performance ranking exercises have been conducted to establish the best batch effect-correction algorithm (BECA), we hold the viewpoint that the notion of best is context-dependent. Moreover, alternative questions beyond the simplistic notion of "best" are also interesting: are BECAs robust against various degrees of confounding and, if so, what is the limit? Using two different methods for simulating class (phenotype) and batch effects and taking various representative datasets across both genomics (RNA-Seq) and proteomics platforms, we demonstrate that under situations where sample classes and batch factors are moderately confounded, most BECAs are remarkably robust and only weakly affected by upstream normalization procedures. This observation is consistently supported across the multitude of test datasets. BECAs do have limits: when sample classes and batch factors are strongly confounded, BECA performance declines, with variable performance in precision, recall and also batch correction. We also report that while conventional normalization methods have minimal impact on batch effect correction, they do not affect downstream statistical feature selection, and in strongly confounded scenarios, may even outperform BECAs. In other words, removing batch effects is no guarantee of optimal functional analysis. Overall, this study suggests that simplistic performance ranking exercises are quite trivial, and all BECAs are compromises in some context or another.
Source: Journal of Genetics and Genomics (遗传学报(英文版)), 2019, Issue 9, pp. 433-443 (11 pages). Indexed in SCIE, CAS, CSCD.
Funding: Supported by the National Research Foundation of Singapore, NRF-NSFC (Grant No. NRF2018NRF-NSFC003SB-006).
Keywords: Batch effects; Bioinformatics; Feature selection; Normalization; Statistics
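
The abstract's central point, that correction degrades as sample classes and batch factors become more strongly confounded, can be illustrated with a toy simulation. The sketch below is not the authors' simulation method, and it does not use any of the BECAs they evaluated. It assumes two classes, two equally sized batches, additive class and batch effects, and per-batch mean-centering as a deliberately simple stand-in for batch correction. All function names, parameters, and effect sizes are hypothetical.

```python
import numpy as np


def simulate(confounding, n_per_batch=40, n_features=100, n_diff=10,
             class_effect=2.0, batch_effect=3.0, noise_sd=1.0, seed=0):
    """Simulate two classes in two batches (illustrative assumptions only).

    `confounding` is the fraction of batch 0 drawn from class 0 and of
    batch 1 drawn from class 1: 0.5 is balanced, 1.0 is fully confounded.
    """
    rng = np.random.default_rng(seed)
    n_major = int(round(confounding * n_per_batch))
    n_minor = n_per_batch - n_major
    batch = np.repeat([0, 1], n_per_batch)
    labels = np.concatenate([
        np.repeat([0, 1], [n_major, n_minor]),  # batch 0: mostly class 0
        np.repeat([1, 0], [n_major, n_minor]),  # batch 1: mostly class 1
    ])
    x = rng.normal(0.0, noise_sd, size=(2 * n_per_batch, n_features))
    x[labels == 1, :n_diff] += class_effect  # class signal, first n_diff features
    x[batch == 1, :] += batch_effect         # additive batch shift, all features
    return x, labels, batch


def center_by_batch(x, batch):
    """Subtract each batch's per-feature mean (a simple stand-in correction)."""
    corrected = x.copy()
    for b in np.unique(batch):
        mask = batch == b
        corrected[mask] -= corrected[mask].mean(axis=0)
    return corrected


def class_gap(x, labels, n_diff=10):
    """Mean class-1 minus class-0 difference over the truly differential features."""
    diff = x[labels == 1, :n_diff].mean(axis=0) - x[labels == 0, :n_diff].mean(axis=0)
    return float(diff.mean())


# Compare the surviving class signal at increasing levels of confounding.
for level in (0.5, 0.75, 1.0):
    x, labels, batch = simulate(level)
    gap = class_gap(center_by_batch(x, batch), labels)
    print(f"confounding={level:.2f}  class gap after correction={gap:.2f}")
```

Under these assumptions, and ignoring noise, the class difference that survives per-batch centering is 4p(1-p) times the simulated class effect, where p is the confounding level. A balanced design (p = 0.5) keeps the full effect, while a fully confounded design (p = 1.0) removes it along with the batch effect. The toy model captures only this one mechanism and says nothing about the precision, recall, or normalization results reported in the abstract.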