
Strategies of Genetic Risk Prediction with Lung Cancer GWAS Data
(Original Chinese title: 使用肺癌GWAS数据进行遗传风险预测的方法和策略研究)
Abstract. Objective: To investigate methods and strategies for genetic risk prediction based on lung cancer genome-wide association study (GWAS) data, comparing three prediction methods: the weighted genetic risk score (wGRS), support vector machine (SVM), and random forest (RF). Methods: The Nanjing and Beijing subsamples of the lung cancer GWAS data served as the training set and the testing set, respectively. Two strategies, the full predictive set (FS) and the best predictive subset (BS), were applied, and the prediction accuracy of the three methods was compared under different linkage disequilibrium (LD) structures and initial screening significance levels (α). Results: Under a high-LD structure, the prediction accuracy of wGRS rose as -log(α) increased. RF and SVM were less sensitive to LD structure than wGRS, but all three methods predicted more accurately under a low-LD structure (r² < 0.2) than under a high-LD structure. With wGRS, BS performed slightly better than FS; with SVM, BS was close to but slightly worse than FS; with RF, BS was clearly worse than FS. Conclusion: Pruning SNPs according to LD structure and choosing an appropriate initial screening level can improve the accuracy of genetic risk prediction; under these conditions, wGRS outperformed SVM and RF.
Source: Chinese Journal of Health Statistics (《中国卫生统计》), indexed in CSCD and the Peking University Core Journals list, 2015, Issue 4, pp. 554-557 (4 pages)
Funding: National Natural Science Foundation of China (81473070, 81373102)
Keywords: Lung cancer; Genetic risk score; Support vector machine; Random forest; Best predictive subset; Single nucleotide polymorphism
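The abstract names wGRS and an initial screening level α but does not give formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the common definition of a weighted genetic risk score (risk-allele counts weighted by per-SNP log odds ratios estimated on the training set) and assumes that screening keeps SNPs whose training-set p-value is below α. The function names `select_snps` and `weighted_grs` and all toy numbers are hypothetical.

```python
import numpy as np


def select_snps(p_values, alpha):
    """Indices of SNPs whose training-set p-value is below the screening level alpha."""
    return np.flatnonzero(np.asarray(p_values, dtype=float) < alpha)


def weighted_grs(genotypes, log_odds_ratios, keep):
    """Weighted genetic risk score per individual.

    genotypes: (n_individuals, n_snps) array of risk-allele counts (0, 1, or 2).
    log_odds_ratios: per-SNP weights, assumed here to be training-set log odds ratios.
    keep: indices of the SNPs retained after screening.
    """
    g = np.asarray(genotypes, dtype=float)[:, keep]
    w = np.asarray(log_odds_ratios, dtype=float)[keep]
    return g @ w


# Toy data (hypothetical, for illustration only).
genotypes = np.array([[0, 1, 2],
                      [2, 0, 1]])
log_or = np.log([1.2, 1.5, 1.1])
p_values = [1e-6, 0.03, 1e-4]

keep = select_snps(p_values, alpha=1e-3)   # the second SNP fails the screen
scores = weighted_grs(genotypes, log_or, keep)
```

Tightening α (a larger -log(α)) shrinks `keep`, which is the knob the abstract varies. LD pruning (for example, dropping one SNP of each pair with r² ≥ 0.2) would be applied to the SNP set before scoring and is not shown here.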
