
Progress of statistical methods for testing interactions in candidate gene association studies based on case-control design (cited by: 4)
Abstract Statistical analysis of gene-gene and gene-environment interactions in candidate gene association studies helps to reveal the mechanisms underlying disease. This article reviews the statistical methods for testing interactions in candidate gene association studies based on a case-control design, and their recent progress. These methods fall into two groups: parametric and non-parametric. Logistic regression is the most commonly used parametric method, while the non-parametric methods are mainly data mining techniques. Four classes of data mining techniques can be applied to candidate gene association studies: dimension reduction, tree-based methods, pattern recognition, and Bayesian methods. We compare the principles, analysis procedures, advantages, and disadvantages of the most commonly used and reliable of these techniques: multifactor dimensionality reduction (MDR), classification and regression trees (CART), random forest, and Bayesian epistasis association mapping (BEAM). Parametric and non-parametric methods each have strengths and weaknesses when used to analyze interactions. Low-dimensional data can be analyzed with either approach, whereas high-dimensional data are analyzed mainly with non-parametric methods. As genotyping technology advances and the number of SNPs that can be assayed grows, non-parametric methods are being applied more and more widely.
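To make the dimension-reduction idea behind MDR concrete, here is a minimal sketch of its core pooling step for pairs of SNPs. It is an illustration under stated assumptions, not the procedure described in the article: genotypes are coded as small integers, the risk threshold is the overall case-to-control ratio, pairs are scored by plain training accuracy, and the cross-validation and permutation testing that a full MDR analysis uses are omitted. The function names `mdr_pair_score` and `best_pair` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mdr_pair_score(genotypes, labels, i, j):
    """Score SNP pair (i, j): pool genotype cells into high/low risk.

    Returns (training accuracy, set of high-risk cells).
    Assumes labels are 1 (case) or 0 (control) and that both occur.
    """
    cases, controls = Counter(), Counter()
    for g, y in zip(genotypes, labels):
        cell = (g[i], g[j])  # two-locus genotype combination
        (cases if y == 1 else controls)[cell] += 1

    # Assumed threshold: the overall case/control ratio in the sample.
    threshold = sum(cases.values()) / sum(controls.values())

    # A cell is high-risk when its case/control ratio exceeds the threshold.
    cells = set(cases) | set(controls)
    high_risk = {c for c in cells if cases[c] > threshold * controls[c]}

    # Predict "case" for subjects falling in a high-risk cell.
    correct = sum(
        ((g[i], g[j]) in high_risk) == (y == 1)
        for g, y in zip(genotypes, labels)
    )
    return correct / len(labels), high_risk


def best_pair(genotypes, labels):
    """Exhaustively search all SNP pairs and return the best-scoring one."""
    n_snps = len(genotypes[0])
    return max(
        combinations(range(n_snps), 2),
        key=lambda p: mdr_pair_score(genotypes, labels, *p)[0],
    )


# Toy data (hypothetical): disease status depends only on the interaction
# between SNP 0 and SNP 1 (an XOR pattern); SNP 2 is noise.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, c in genotypes]

print(best_pair(genotypes, labels))
```

In this toy sample neither SNP 0 nor SNP 1 shows a marginal effect on its own, which is the kind of pure interaction that pooling multi-locus genotype cells is meant to pick up.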
Source Fudan University Journal of Medical Sciences (《复旦学报(医学版)》), 2011, No. 3, pp. 265-270 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
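Funding National Natural Science Foundation of China (30271113); National Basic Research Program of China (973 Program), Ministry of Science and Technology (2002CB512902); Shanghai Key Discipline Construction Program in Occupational Health (08GWZX0402)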
Keywords candidate gene association studies; case-control design; interaction; data mining

