
Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares (Cited by: 2)

Abstract: With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie complex human diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR applied to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
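The abstract's remark that Bonferroni-corrected single-locus LR is conservative can be illustrated with a small simulation. The sketch below is not the authors' simulation design. It assumes that the per-SNP test statistics in a set can be approximated under the null by equicorrelated standard normal variables, with the pairwise correlation `rho` standing in for linkage disequilibrium. The function names, the set size of 20 SNPs, and the values of `rho` are all hypothetical choices made for illustration.

```python
import math
import random


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_set_test(p_values, alpha=0.05):
    """Call the SNP set significant if any p-value is below alpha / m."""
    m = len(p_values)
    return min(p_values) < alpha / m


def simulate_null_fwer(m, rho, n_rep=4000, alpha=0.05, seed=1):
    """Estimate the set-level type I error of the Bonferroni rule under the null.

    Assumption: each replicate draws m equicorrelated N(0, 1) statistics
    (pairwise correlation rho) as a stand-in for single-SNP tests in LD.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        shared = rng.gauss(0.0, 1.0)  # component common to all SNPs in the set
        z = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(m)
        ]
        if bonferroni_set_test([two_sided_p(v) for v in z], alpha):
            hits += 1
    return hits / n_rep


fwer_independent = simulate_null_fwer(m=20, rho=0.0)
fwer_high_ld = simulate_null_fwer(m=20, rho=0.9)
print(f"independent SNPs: {fwer_independent:.3f}")
print(f"high-LD SNPs:     {fwer_high_ld:.3f}")
```

Under these assumptions, the estimated type I error for the strongly correlated set falls well below the nominal 0.05, because the Bonferroni threshold treats correlated tests as if they were independent. Set-level approaches such as PC-LR and PLS-LR avoid this by testing a few derived components rather than every SNP separately.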
Source: The Journal of Biomedical Research (生物医学研究杂志(英文版)), indexed in CAS and CSCD, 2015, Issue 4, pp. 298-307 (10 pages)
Funding: Funded by the National Natural Science Foundation of China (81202283, 81473070, 81373102 and 81202267); the Key Grant of Natural Science Foundation of the Jiangsu Higher Education Institutions of China (10KJA330034 and 11KJA330001); the Research Fund for the Doctoral Program of Higher Education of China (20113234110002); and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine).
Keywords: principal components analysis; partial least squares-based logistic regression; genome-wide association study; type I error; power
