Abstract
Because early-stage lung adenocarcinoma lacks obvious symptoms, traditional detection methods rarely meet the requirements of early clinical diagnosis. Early cancer diagnosis based on DNA methylation biomarkers shows great promise. In this study, we analyzed a training dataset of tumor and normal samples and selected the ten probes with the largest methylation difference between the two groups. These ten probes were used to build a generalized linear diagnostic model, with the Lasso method introduced for variable selection. The final diagnostic model uses the methylation levels of four probes, corresponding to the genes TRIM58, HOXA9, HOXB4 and PRAC, and a reasonable interval for the classification score threshold is provided. Applied to three independent test datasets, the model performed well on each, with the AUC of every ROC curve above 0.99.
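The pipeline summarized in the abstract (rank probes by tumor/normal methylation difference, keep the top ten, fit a Lasso-penalized generalized linear model, score by AUC) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the logistic link, the proximal-gradient solver, the penalty strength `lam`, and all function names are hypothetical, and none of it reproduces the authors' software, probes, or results.

```python
import math
import random

def make_data(n_per_class, n_probes, seed):
    """Synthetic beta values in [0, 1]; only the first 10 probes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = []
            for j in range(n_probes):
                shift = (0.5 - 0.03 * j) if (label == 1 and j < 10) else 0.0
                row.append(min(1.0, max(0.0, rng.gauss(0.25 + shift, 0.1))))
            X.append(row)
            y.append(label)
    return X, y

def top_k_probes(X, y, k):
    """Indices of the k probes with the largest |mean(tumor) - mean(normal)|."""
    diffs = []
    for j in range(len(X[0])):
        tumor = [r[j] for r, lab in zip(X, y) if lab == 1]
        normal = [r[j] for r, lab in zip(X, y) if lab == 0]
        diffs.append(abs(sum(tumor) / len(tumor) - sum(normal) / len(normal)))
    return sorted(range(len(diffs)), key=lambda j: diffs[j], reverse=True)[:k]

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_lasso_logistic(X, y, lam=0.05, lr=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, lab in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - lab
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def auc(scores, labels):
    """AUC as the probability that a tumor sample outscores a normal one."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0 for a in pos for c in neg)
    return wins / (len(pos) * len(neg))

X_train, y_train = make_data(40, 30, seed=1)
X_test, y_test = make_data(40, 30, seed=2)

selected = top_k_probes(X_train, y_train, k=10)
weights, intercept = fit_lasso_logistic([[r[j] for j in selected] for r in X_train], y_train)
scores = [intercept + sum(w * r[j] for w, j in zip(weights, selected)) for r in X_test]
test_auc = auc(scores, y_test)

kept = [j for w, j in zip(weights, selected) if w != 0.0]
print("probes kept by the L1 penalty:", kept)
print("test AUC:", round(test_auc, 3))
```

Raising `lam` drives more coefficients to exactly zero, which is how a Lasso penalty can reduce ten candidate probes to a smaller panel. A classification threshold would then be chosen on the score scale; the interval reported in the paper is not reproduced here.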
Source
Journal of Fudan University (Natural Science)
CAS
CSCD
Peking University Core Journals
2017, No. 6, pp. 671-680 (10 pages)
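Funding
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (doctoral supervisor category) (20120071110018)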
Keywords
lung adenocarcinoma
early-stage diagnosis
methylation
probe
generalized linear model
Lasso