Screening of immune-related genes and survival prediction of lung adenocarcinoma patients based on the LightGBM model
Abstract Lung cancer is among the malignant tumors that pose the greatest threat to human health, and studies have shown that some genes play an important regulatory role in its occurrence and development. This paper proposes a LightGBM-based ensemble learning method that builds a prognostic model from immune-related gene (IRG) expression profile data and clinical data to predict the survival rate of lung adenocarcinoma patients. First, the Limma package is used for differential gene expression analysis. CoxPH regression analysis is then used to screen prognosis-related IRGs, the XGBoost algorithm scores the importance of the IRG features, and finally LASSO regression analysis selects the IRGs used to build the prognostic model, yielding 17 IRG features in total. LightGBM is trained on the selected IRG features, and the K-means algorithm divides the patients into three groups. The area under the curve (AUC) of the receiver operating characteristic (ROC) of the model output indicates that the model predicts the survival rate of the three groups with an accuracy of 96%, 98% and 96%, respectively. The experimental results show that the proposed model can divide lung adenocarcinoma patients into three groups [5-year survival rate higher than 65% (group 1), lower than 65% but higher than 30% (group 2), and lower than 30% (group 3)] and can predict their 5-year survival rate with good accuracy.
Authors MENG Xiangfu (孟祥福); TIAN Youfa (田友发); ZHANG Xiaoyan (张霄雁) (School of Electronics and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125000, P.R. China)
Source Journal of Biomedical Engineering (《生物医学工程学杂志》), indexed in EI, CAS and the Peking University Core Journals list; 2024, Issue 1, pp. 70-79 (10 pages)
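Funding National Natural Science Foundation of China (61772249); Scientific Research Project of the Education Department of Liaoning Province (LJ2019QL017, LJKZ0355).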
Keywords Lung adenocarcinoma; Bioinformatics; Ensemble learning; Immune-related gene; LightGBM
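
The abstract reports three survival groups defined by 5-year survival cutoffs (65% and 30%) and a per-group AUC. The sketch below is not the authors' pipeline; it only illustrates, in plain Python, how such cutoffs could map a predicted survival probability to a group and how a one-vs-rest AUC per group could be computed. The function names (`survival_group`, `roc_auc`, `one_vs_rest_auc`) and the toy scores are hypothetical, and the handling of values exactly at 65% or 30% is an assumption, since the abstract does not specify it. Note that AUC measures how well scores rank positives above negatives, which is related to but not the same as classification accuracy.

```python
from typing import Dict, List, Sequence

# Cutoffs taken from the abstract: >65% (group 1), 30%-65% (group 2), <30% (group 3).
HIGH_CUTOFF = 0.65
LOW_CUTOFF = 0.30


def survival_group(five_year_survival: float) -> int:
    """Map a predicted 5-year survival probability to group 1, 2 or 3.

    Assumption: values exactly at a cutoff fall into the lower-survival group.
    """
    if five_year_survival > HIGH_CUTOFF:
        return 1
    if five_year_survival > LOW_CUTOFF:
        return 2
    return 3


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    true_groups: Sequence[int], group_scores: Dict[int, List[float]]
) -> Dict[int, float]:
    """Compute one AUC per group, treating that group as positive and the rest as negative."""
    return {
        group: roc_auc([1 if t == group else 0 for t in true_groups], scores)
        for group, scores in group_scores.items()
    }


# Toy data invented for illustration only (six hypothetical patients).
toy_true_groups = [1, 1, 2, 2, 3, 3]
toy_group_scores = {
    1: [0.9, 0.7, 0.2, 0.4, 0.1, 0.05],
    2: [0.05, 0.6, 0.7, 0.5, 0.3, 0.1],
    3: [0.05, 0.1, 0.1, 0.1, 0.6, 0.85],
}

if __name__ == "__main__":
    print([survival_group(p) for p in (0.8, 0.5, 0.2)])
    print(one_vs_rest_auc(toy_true_groups, toy_group_scores))
```

In a real workflow the per-group scores would come from the trained classifier's predicted class probabilities rather than hand-written lists.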