Abstract
Objective: To develop and validate a machine learning classification model, based on CT radiomics and morphological features, for predicting the prognostic survival time range of patients with non-small cell lung cancer (NSCLC).
Methods: The lung1 dataset was downloaded from The Cancer Imaging Archive (TCIA), and 243 eligible patients with peripheral NSCLC were selected and divided into two groups according to survival time at cutoff (group 1: survival ≤3 years; group 2: survival >3 years). For each lesion, 1037 radiomics features were extracted and screened with the least absolute shrinkage and selection operator (LASSO) algorithm. The morphological features of each lesion were recorded and screened with the t-test and the chi-square test. The selected radiomics and morphological features were then combined, and prediction models were built with five machine learning classification methods: logistic regression, random forest, AdaBoost, Gaussian naive Bayes, and a neural network (MLP classifier). Receiver operating characteristic (ROC) curves were used to evaluate the performance of the five models and to select the optimal one. Finally, external validation was performed on data from 77 patients collected at The First Affiliated Hospital of Guangzhou University of Chinese Medicine.
Results: The Gaussian naive Bayes model was the best model in this study and was relatively stable; among all models, its AUC was relatively high in both the training and validation sets. After external validation, the model achieved an AUC of 0.735 (sensitivity 0.685, specificity 0.700) in the training set and an AUC of 0.771 (sensitivity 0.571, specificity 0.898) in the test set.
Conclusion: A machine learning classification model based on CT radiomics combined with morphological features can predict the prognostic survival time range of NSCLC patients with relatively good accuracy.
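The Results report AUC, sensitivity, and specificity. As a reading aid, the following minimal Python sketch shows how these three metrics are commonly computed from true labels and predicted scores. It is not the authors' code. The function names, the 0.5 decision threshold, the choice of which survival group counts as the positive class (label 1), and the example values are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A case is predicted positive when its score is >= threshold.
    Assumes both classes are present in `labels`."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels and scores for illustration only (not study data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
# AUC=0.875 sensitivity=0.500 specificity=0.750
```

AUC does not depend on a threshold, whereas sensitivity and specificity do. This is why a model can combine a moderate AUC with high specificity and lower sensitivity, as in the test-set figures above.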
Authors
ZHOU Jie (周洁)
ZHENG Yan-ting (郑燕婷)
JIANG Shu-qi (江舒琪)
AN Jie (安杰)
QIU Shi-jun (邱士军)
CHEN Huai (陈淮)
ZHOU Jie; ZHENG Yan-ting; JIANG Shu-qi (Department of Radiology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510405, China)
Source
Radiologic Practice (《放射学实践》)
CSCD
PKU Core Journal (北大核心)
2024, Issue 5, pp. 622-628 (7 pages)
Funding
Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011028).