Abstract
This paper explores the independent risk factors for lung tumor infiltration and establishes a predictive classification model. Current lung tumor prediction models mostly rely on traditional logistic regression, which is prone to overfitting and offers limited predictive accuracy. To address this, a risk prediction model for lung tumor infiltration based on the random forest (RF) algorithm is proposed and compared with logistic regression, decision tree (CART), support vector machine (SVM), and XGBoost models in three respects: model discrimination, model calibration, and clinical applicability. The results show that the random forest classifier achieves higher predictive accuracy, consistency, and clinical applicability on the lung tumor infiltration dataset. Analysis of variable feature importance shows that malignancy probability has a significant influence on determining lung tumor infiltration, which is of practical value for promoting the early detection, early diagnosis, and early treatment of lung cancer.
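The abstract does not say which metrics stand behind the three comparison criteria. As a minimal sketch, assuming AUC for discrimination, the Brier score for calibration, and decision-curve net benefit for clinical applicability (common choices, not confirmed by the source), the comparison could be computed as follows. The function names and toy values are invented for illustration and are not the paper's data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Calibration proxy: mean squared gap between predicted probability
    and observed outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Clinical-utility proxy: net benefit at one decision threshold,
    as used in decision curve analysis."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy labels and predicted probabilities (hypothetical, not from the paper).
toy_labels = [1, 0, 1, 0]
toy_good = [0.9, 0.2, 0.6, 0.4]   # ranks every positive above every negative
toy_poor = [0.6, 0.7, 0.4, 0.3]   # ranks cases no better than chance

if __name__ == "__main__":
    for name, probs in [("good", toy_good), ("poor", toy_poor)]:
        print(name,
              round(auc(toy_labels, probs), 4),
              round(brier_score(toy_labels, probs), 4),
              round(net_benefit(toy_labels, probs, 0.5), 4))
```

In practice each candidate model's predicted probabilities on a held-out set would replace the toy lists, and net benefit would be evaluated over a range of thresholds rather than a single one.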
Authors
ZHU Ping (朱萍)
ZHOU Tao (周涛)
ZHAO Benying (赵奔英)
TANG Hui (唐慧)
XIA Kaijian (夏开建)
Affiliations
Intelligent Medical Technology Research Center, Changshu Hospital Affiliated to Soochow University (The First People's Hospital of Changshu), Changshu 215500, Jiangsu Province, China
Department of Thoracic Surgery, Changshu Hospital Affiliated to Soochow University (The First People's Hospital of Changshu), Changshu 215500, Jiangsu Province, China
Department of Pharmacy, Changshu Hospital Affiliated to Soochow University (The First People's Hospital of Changshu), Changshu 215500, Jiangsu Province, China
Source
China Digital Medicine (《中国数字医学》)
2023, Issue 11, pp. 90-96 (7 pages)
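Funding
Suzhou Key Supported Discipline Funding Project in Health Informatics (SZFCXK202147)
Suzhou Special Project for Diagnosis and Treatment Technology of Key Clinical Diseases (LCZX202124)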
Keywords
Lung tumor infiltration
Random forest
Machine learning
Variable feature evaluation