Statistical Theory of Gene-set Analysis Methods
Abstract  Objective: To examine gene set analysis methods from the standpoint of statistical theory and to establish a preliminary theoretical framework for the statistical analysis of gene sets in microarray data. Methods: Computer simulation was used to compare the statistical properties of gene set analysis under different null hypotheses and different methods of generating the theoretical distribution. Results: The area under the ROC curve (AUC) was 0.858 for the self-contained null hypothesis method and 0.512 for the competitive null hypothesis method. Under the same settings, the false discovery rate (FDR) of the bootstrap method (at most 0.015) was lower than that of the permutation test (at most 0.075), whereas the power of the permutation method (0.89) was higher than that of the bootstrap method (0.67). Conclusion: An effective gene set analysis method should, on the basis of correct use of biologically annotated genes, adopt a self-contained null hypothesis, build the gene set statistic from standardized gene expression values, and, depending on the requirements, use a valid randomization algorithm to construct the theoretical distribution of the statistic for inference.
Source  Chinese Journal of Health Statistics (《中国卫生统计》), CSCD, Peking University Core Journal, 2013, No. 4, pp. 484-486 (3 pages)
Funding  National Natural Science Foundation of China (Grant No. 81172770)
Keywords  Microarray data; Gene set analysis method; Statistical theory; Monte Carlo simulation
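
The abstract compares two ways of generating the reference distribution of a gene set statistic under a self-contained null hypothesis (no gene in the set is associated with the phenotype). The sketch below is only an illustration of that idea, not the authors' simulation: the statistic (sum of squared per-gene t statistics), the pooled-resampling form of the bootstrap, and all names (`t_stat`, `set_stat`, `null_pvalue`, `simulate_set`) are assumptions introduced here.

```python
import random
from statistics import mean, stdev


def t_stat(x, y):
    """Welch-type two-sample t statistic for one gene (a standardized value)."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se if se > 0 else 0.0


def set_stat(expr, labels):
    """Assumed gene-set statistic: sum of squared per-gene t statistics.

    expr is a list of genes, each a list of per-sample expression values;
    labels gives the 0/1 group of each sample (at least 2 samples per group).
    """
    total = 0.0
    for gene in expr:
        g0 = [v for v, lab in zip(gene, labels) if lab == 0]
        g1 = [v for v, lab in zip(gene, labels) if lab == 1]
        total += t_stat(g0, g1) ** 2
    return total


def null_pvalue(expr, labels, method="permutation", n_rand=200, seed=0):
    """P-value of the set statistic under the self-contained null.

    "permutation": reorder sample columns without replacement.
    any other value: bootstrap, i.e. draw sample columns with replacement
    from the pooled data (an assumed variant of the bootstrap test).
    Whole columns are moved, so gene-gene correlation is preserved.
    """
    rng = random.Random(seed)
    observed = set_stat(expr, labels)
    n = len(labels)
    exceed = 0
    for _ in range(n_rand):
        if method == "permutation":
            idx = rng.sample(range(n), n)
        else:
            idx = [rng.randrange(n) for _ in range(n)]
        resampled = [[gene[i] for i in idx] for gene in expr]
        if set_stat(resampled, labels) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_rand + 1)


def simulate_set(n_genes=10, n_per_group=10, shift=2.0, seed=1):
    """Toy data: every gene in the set is shifted by `shift` in group 1."""
    rng = random.Random(seed)
    labels = [0] * n_per_group + [1] * n_per_group
    expr = [[rng.gauss(shift * lab, 1.0) for lab in labels]
            for _ in range(n_genes)]
    return expr, labels


if __name__ == "__main__":
    expr, labels = simulate_set()
    print("permutation p:", null_pvalue(expr, labels, "permutation"))
    print("bootstrap p:  ", null_pvalue(expr, labels, "bootstrap"))
```

Repeating such a run over many simulated gene sets, with and without a true shift, is one way the FDR and power figures of the two randomization schemes could be compared; the toy settings here are not those of the paper.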

References (12)

  • 1. Tian L, Greenberg SA, Kong SW, et al. Discovering statistically significant pathways in expression profiling studies. Proc Nat Acad Sci, 2005, 102(38): 13544-13549.
  • 2. Goeman JJ, Bühlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics, 2007, 23(8): 980-987.
  • 3. Cao WJ, Li YM, Chen CS. Comparison of the validity of two gene set analysis methods [J]. Chinese Journal of Health Statistics, 2009, 26(5): 462-465. (In Chinese.)
  • 4. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc, 1995, 57(1): 289-300.
  • 5. Khatri P, Drăghici S. Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics, 2005, 21(18): 3587-3595.
  • 6. Pavlidis P, Qin J, Arango V, et al. Using the gene ontology for microarray data mining: a comparison of methods and application to age effects in human prefrontal cortex. Neurochem Res, 2004, 29(6): 1213-1222.
  • 7. Barry WT, Nobel AB, Wright FA. Significance analysis of functional categories in gene expression studies: a structured permutation approach. Bioinformatics, 2005, 21: 1943-1949.
  • 8. Dinu I, Potter JD. Improving gene set analysis of microarray data by SAM-GS. BMC Bioinformatics, 2007, 8: 242 (1-13).
  • 9. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Nat Acad Sci, 2005, 102(43): 15545-15550.
  • 10. Hesterberg T, Moore DS, Monaghan S, et al. Bootstrap methods and permutation tests. In: Moore DS, McCabe GP (eds.), Introduction to the Practice of Statistics. Freeman, New York, 2005.
