Application of Multiple Test Techniques in Big Data Analysis
Abstract: When hypothesis tests are applied to big data, multiple testing procedures are needed to control false positives. Many such procedures exist. This paper analyzes real data to compare the strengths and weaknesses of several of them and to indicate when each is appropriate, with the aim of giving data analysts theoretical guidance. It first explains why multiple testing is necessary and introduces the related concepts. It then presents two classes of methods: those that control the family-wise error rate (FWER) and those that control the false discovery rate (FDR). Finally, it applies these methods to large-scale gene data to judge whether genes are expressed. The experimental results show that FDR-controlling methods outperform FWER-controlling methods, and that among the FDR-controlling methods the q-value method gives the best results. The reason given is that the q-value method takes prior information about the null hypothesis into account and controls the false discovery rate well, so it has high accuracy and power.
Source: Advances in Applied Mathematics (《应用数学进展》), 2021, No. 10, pp. 3532-3538 (7 pages)
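
The abstract does not name the specific procedures compared, so the following is only a minimal sketch of the two classes it describes, under stated assumptions: Bonferroni and Holm stand in for FWER control, Benjamini-Hochberg for FDR control, and a simplified Storey-style q-value (with a fixed tuning parameter `lam`, chosen here for illustration) for the q-value method. The function names and the toy p-values are hypothetical and are not taken from the paper.

```python
from typing import List


def bonferroni(pvals: List[float]) -> List[float]:
    """Bonferroni adjusted p-values (FWER control)."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals: List[float]) -> List[float]:
    """Holm step-down adjusted p-values (FWER control)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1);
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def _step_up(pvals: List[float], pi0: float) -> List[float]:
    """Step-up adjustment: min over j >= i of pi0 * m * p_(j) / j."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, pi0 * m * pvals[i] / (rank + 1))
        adjusted[i] = running_min
    return adjusted


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (FDR control, pi0 taken as 1)."""
    return _step_up(pvals, 1.0)


def storey_qvalues(pvals: List[float], lam: float = 0.5) -> List[float]:
    """Simplified q-values: estimate pi0, the share of true nulls, from the
    p-values above `lam`, then rescale the step-up adjustment by it.
    Assumption: fixed `lam`, no smoothing; if no p-value exceeds `lam`,
    the estimate degenerates to pi0 = 0."""
    m = len(pvals)
    pi0 = min(1.0, sum(p > lam for p in pvals) / ((1.0 - lam) * m))
    return _step_up(pvals, pi0)


def count_rejections(adjusted: List[float], alpha: float = 0.05) -> int:
    """Number of hypotheses rejected at level alpha."""
    return sum(a <= alpha for a in adjusted)


# Hypothetical toy p-values, for illustration only (not data from the paper).
toy_pvals = [0.001, 0.008, 0.012, 0.021, 0.03, 0.2, 0.45, 0.6, 0.75, 0.9]

for name, method in [("Bonferroni", bonferroni), ("Holm", holm),
                     ("Benjamini-Hochberg", benjamini_hochberg),
                     ("q-value", storey_qvalues)]:
    print(name, count_rejections(method(toy_pvals)))
```

In this sketch the only difference between `benjamini_hochberg` and `storey_qvalues` is the factor `pi0`, which is one way to read the abstract's remark that the q-value method uses prior information about the null hypothesis: an estimate of `pi0` below 1 shrinks the adjusted values and can increase the number of rejections.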

