Abstract
目的基于Logistic回归与随机森林算法构建甲状腺乳头状癌(PTC)颈部淋巴结转移(LNM)的预测模型,比较二者的诊断效能。方法选取我院收治的PTC患者156例,依据是否发生颈部LNM分为非转移组65例和转移组91例,比较两组超声、基因检测及临床检查结果的差异。采用多因素Logistic回归分析筛选PTC颈部LNM的独立影响因素,基于Logistic回归和随机森林算法分别构建PTC颈部LNM的预测模型;绘制受试者工作特征(ROC)曲线分析模型的诊断效能。结果转移组与非转移组年龄、结节最大径、甲状腺球蛋白抗体(TgAb)水平,以及甲状腺外浸润(ETE)、BRAFV600E基因突变、微钙化占比比较差异均有统计学意义(均P<0.05)。多因素Logistic回归分析显示,微钙化、结节最大径、ETE、年龄、TgAb水平、BRAFV600E基因突变均为PTC颈部LNM的独立影响因素(均P<0.05)。ROC曲线分析显示,Logistic回归模型预测PTC颈部LNM的曲线下面积(AUC)为0.763;随机森林模型显示,树的数目为272时错误率最低,模型预测PTC颈部LNM的相对重要预测因子排序依次为TgAb水平、BRAFV600E基因突变、微钙化、年龄、ETE、结节最大径,其预测PTC颈部LNM的AUC为0.856,高于Logistic回归模型(Z=2.812,P=0.005)。结论基于随机森林算法构建的PTC颈部LNM预测模型的诊断效能高于基于Logistic回归构建的预测模型,临床医师可根据PTC患者颈部LNM的随机森林重要性排序制定合适的干预措施。
Objective To construct predictive models of cervical lymph node metastasis (LNM) in papillary thyroid carcinoma (PTC) based on Logistic regression and the random forest algorithm, and to compare their diagnostic efficacy. Methods A total of 156 PTC patients treated in our hospital were selected and divided into a non-metastatic group (n=65) and a metastatic group (n=91) according to the presence of cervical LNM. Ultrasound, genetic testing and clinical examination results were compared between the two groups. Multivariate Logistic regression was used to screen for independent influencing factors of cervical LNM in PTC. Predictive models of cervical LNM in PTC were constructed based on Logistic regression and the random forest algorithm, respectively, and their diagnostic efficacy was analyzed with receiver operating characteristic (ROC) curves. Results Age, maximum nodule diameter, thyroglobulin antibody (TgAb) level, and the proportions of extrathyroidal extension (ETE), BRAF V600E gene mutation and microcalcification differed significantly between the metastatic and non-metastatic groups (all P<0.05). Multivariate Logistic regression showed that microcalcification, maximum nodule diameter, ETE, age, TgAb level and BRAF V600E gene mutation were all independent influencing factors for cervical LNM in PTC (all P<0.05). ROC curve analysis showed that the area under the curve (AUC) of the Logistic regression model for predicting cervical LNM in PTC was 0.763. The random forest model had its lowest error rate when the number of trees was 272, and its predictors of cervical LNM in PTC ranked by relative importance were, in order, TgAb level, BRAF V600E gene mutation, microcalcification, age, ETE and maximum nodule diameter. The AUC of the random forest model for predicting cervical LNM in PTC was 0.856, higher than that of the Logistic regression model (Z=2.812, P=0.005). Conclusion The predictive model of cervical LNM in PTC based on the random forest algorithm has higher diagnostic efficacy than the model based on Logistic regression. Clinicians can develop appropriate interventions for PTC patients according to the random forest importance ranking for cervical LNM.
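The abstract compares the two models by their AUC but does not state how the AUC was computed. As a minimal sketch only, the AUC can be read as the probability that a randomly chosen metastatic case receives a higher predicted score than a randomly chosen non-metastatic case. The function name `roc_auc`, the two "models" and all scores below are hypothetical illustrations, not study data and not the authors' implementation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = metastasis, 0 = no metastasis.
labels = [1, 1, 1, 0, 0, 0]
model_a_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]  # hypothetical model A
model_b_scores = [0.9, 0.8, 0.7, 0.5, 0.3, 0.2]  # hypothetical model B

auc_a = roc_auc(labels, model_a_scores)
auc_b = roc_auc(labels, model_b_scores)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

A higher AUC means the model ranks metastatic cases above non-metastatic ones more consistently. Whether a difference between two AUCs is statistically significant requires a separate test, such as the Z test reported in the abstract, which this sketch does not implement.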
Authors
冀波
沈冬花
赵冬雪
黄晓旭
周丽娜
王龚
JI Bo; SHEN Donghua; ZHAO Dongxue; HUANG Xiaoxu; ZHOU Li'na; WANG Gong (Department of Ultrasound Diagnosis, PLA Rocket Force Characteristic Medical Center, Beijing 100088, China)
Source
《临床超声医学杂志》
CSCD
2024, Issue 9, pp. 735-740 (6 pages)
Journal of Clinical Ultrasound in Medicine