
Effect of data perturbation in microarray on selecting differentially expressed genes by false discovery rate (cited by: 2)
Abstract
Objective: To investigate how perturbation of microarray data affects the selection of differentially expressed genes by false discovery rate (FDR) methods.
Methods: Using computer simulation, random perturbations with different relative error limits were applied to colon cancer microarray data comprising 1,991 genes, with 1,000 random simulations per error limit. Differentially expressed genes were selected from the unperturbed and the perturbed data with the adaptive linear step-up (ALSU) procedure, an FDR method, and the repetition rate between the two selections was compared. The effect of perturbation on changes in gene rank order was also analyzed.
Results: Both the per-gene average repetition rate and the overall average repetition rate of differentially expressed genes decreased as the perturbation increased. The more significantly a gene was differentially expressed, the less it was affected by perturbation error. For error limits ≤50%, the overall average repetition rate declined approximately linearly with the error limit, falling by about 1.85% for each 1% increase in the error limit. The larger the error limit, the greater the fluctuation in gene rank order.
Conclusion: Data perturbation is one cause of the poor reproducibility of differentially expressed genes, and computer simulation can be used to quantify its effect on the selection of differentially expressed genes.
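The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. Several details are assumptions made for illustration: the perturbation is drawn as a uniform relative error within ±limit, p-values come from a Welch statistic with a normal approximation, the two-stage adaptive step-up procedure of Benjamini, Krieger and Yekutieli stands in for ALSU, the repetition rate is taken as the share of genes selected from the unperturbed data that are selected again after perturbation, and the data are synthetic. All function names are hypothetical.

```python
import math
import random


def bh_step_up(pvals, q):
    """Benjamini-Hochberg linear step-up: return the set of rejected indices."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value lies on or under the step-up line
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank
    return set(order[:k])


def adaptive_step_up(pvals, q=0.05):
    """Two-stage adaptive step-up (assumed stand-in for ALSU)."""
    m = len(pvals)
    q1 = q / (1 + q)
    stage1 = bh_step_up(pvals, q1)
    if len(stage1) in (0, m):
        return stage1
    m0_hat = m - len(stage1)  # estimated number of true null hypotheses
    return bh_step_up(pvals, q1 * m / m0_hat)


def welch_pvalue(a, b):
    """Two-sided p-value of a Welch statistic, normal approximation (assumption)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(va / len(a) + vb / len(b))
    if se == 0:
        return 1.0
    return math.erfc(abs(ma - mb) / se / math.sqrt(2))


def select_genes(data, n_case, q=0.05):
    """Each row is one gene: the first n_case values are cases, the rest controls."""
    pvals = [welch_pvalue(row[:n_case], row[n_case:]) for row in data]
    return adaptive_step_up(pvals, q)


def perturb(data, limit, rng):
    """Multiply every value by (1 + u), u ~ Uniform(-limit, limit) (assumption)."""
    return [[x * (1 + rng.uniform(-limit, limit)) for x in row] for row in data]


def mean_repetition_rate(data, n_case, limit, n_sim, rng, q=0.05):
    """Average share of baseline genes that are selected again after perturbation."""
    baseline = select_genes(data, n_case, q)
    if not baseline:
        return float("nan")
    rates = []
    for _ in range(n_sim):
        selected = select_genes(perturb(data, limit, rng), n_case, q)
        rates.append(len(baseline & selected) / len(baseline))
    return sum(rates) / n_sim


def make_synthetic(rng, n_genes=200, n_diff=20, n_case=15, n_ctrl=15):
    """Synthetic expression matrix; the first n_diff genes are shifted in cases."""
    data = []
    for g in range(n_genes):
        shift = 2.0 if g < n_diff else 0.0
        cases = [rng.gauss(10 + shift, 1) for _ in range(n_case)]
        controls = [rng.gauss(10, 1) for _ in range(n_ctrl)]
        data.append(cases + controls)
    return data


rng = random.Random(0)
data = make_synthetic(rng)
for limit in (0.0, 0.1, 0.3, 0.5):
    rate = mean_repetition_rate(data, 15, limit, 10, rng)
    print(f"error limit {limit:.0%}: mean repetition rate {rate:.3f}")
```

With an error limit of 0 the perturbed data equal the original, so the repetition rate is 1; the rates at larger limits can be read against that baseline. The study itself used 1,000 simulations per error limit on real colon cancer data, so the numbers from this toy run are not comparable to its results.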
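Source: Journal of Nanjing Medical University (Natural Sciences) (《南京医科大学学报(自然科学版)》), indexed in CAS, CSCD and the Peking University Core Journals list; 2014, No. 7, pp. 991-995, 1002 (6 pages)
Funding: Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (13KJB310007); Key Project of the Science and Technology Development Fund of Nanjing Medical University (2013NJMU006)
Keywords: microarray; differentially expressed genes; false discovery rate; data perturbation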

References (16)

  • 1 Schena M, Shalon D, Davis RW, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray [J]. Science, 1995, 270(5235): 467-470.
  • 2 Schena M, Shalon D, Heller R, et al. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes [J]. Proc Natl Acad Sci USA, 1996, 93(20): 10614-10619.
  • 3 Ein-Dor L, Zuk O, Domany E. Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer [J]. Proc Natl Acad Sci USA, 2006, 103(15): 5923-5928.
  • 4 Zhang M, Yao C, Guo Z, et al. Apparently low reproducibility of true differential expression discoveries in microarray studies [J]. Bioinformatics, 2008, 24(18): 2057-2063.
  • 5 邹金凤, 郝春香, 洪贵妮, 郭政. Reproducibility of identifying breast cancer metastasis-related genes and functions [J]. 生物信息学, 2012, 10(1): 27-30. Cited by: 1
  • 6 Lee ML, Kuo FC, Whitmore GA, et al. Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations [J]. Proc Natl Acad Sci USA, 2000, 97(18): 9834-9839.
  • 7 罗瑶, 许宏, 李瑶, 韩志勇, 裘敏燕, 陈沁, 刘三震, 倪胜, 谢毅, 毛裕民. Reliability validation analysis of expression-profiling gene chips [J]. Acta Genetica Sinica, 2003, 30(7): 611-618. Cited by: 11
  • 8 荀鹏程, 赵杨, 柏建岭, 易洪刚, 于浩, 陈峰. Multiple comparisons for microarray data [J]. 中国卫生统计, 2006, 23(1): 5-8. Cited by: 12
  • 9 Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing [J]. J R Statist Soc B, 1995, 57(1): 289-300.
  • 10 Benjamini Y, Liu W. A step-down multiple testing procedure that controls the false discovery rate under independence [J]. J Stat Plan Infer, 1999, 82(1-2): 163-170.

