Abstract
Objective: To construct prediction models of stroke-associated pneumonia (SAP) in ischemic stroke patients with dysphagia using different machine learning algorithms. Methods: Clinical data of 620 ischemic stroke patients with dysphagia admitted from June 2022 to February 2024 were collected. Least absolute shrinkage and selection operator (LASSO) regression was used to analyze the influencing factors for SAP. Patients were randomly assigned to a training set (n=434) and a testing set (n=186) at a 7:3 ratio. Using the training set, six prediction models were constructed with extreme gradient boosting (XGBoost), Logistic regression, random forest (RF), decision tree (DT), support vector machine (SVM), and K-nearest neighbors (KNN). The performance of each model was evaluated and then validated in the testing set. SHapley Additive exPlanations (SHAP) was used to rank the importance of each variable. Results: LASSO regression selected eight features. Among the six models built on these features, the Logistic regression model had the best predictive performance, with an area under the curve (AUC) of 0.872, a sensitivity of 0.787, and a specificity of 0.839 in the testing set. It also performed well on the calibration curve and showed good clinical benefit on decision curve analysis (DCA). The SHAP plot ranked the influencing factors for SAP in the following order of importance: National Institutes of Health Stroke Scale (NIHSS) score, Barthel index, smoking history, age, Kubota water swallowing test grade, hypertension, diabetes, and indwelling gastric tube. Conclusion: The SAP prediction model for ischemic stroke patients with dysphagia built on Logistic regression shows good predictive performance. It can serve as an auxiliary tool for SAP screening and help identify high-risk patients.
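As a reading aid, the 7:3 split and the three reported test-set metrics (AUC, sensitivity, specificity) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the random seed, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made up for the example, and none of the study's data are reproduced here.

```python
import random


def split_indices(n, train_frac=0.7, seed=0):
    """Randomly split range(n) into training and testing index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary assumption
    n_train = round(n * train_frac)
    return idx[:n_train], idx[n_train:]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# 620 patients at a 7:3 ratio gives the set sizes stated in the abstract.
train_idx, test_idx = split_indices(620)
print(len(train_idx), len(test_idx))  # 434 186

# Hypothetical labels (1 = SAP) and predicted probabilities, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(sensitivity_specificity(toy_labels, toy_preds))  # (0.75, 0.75)
print(auc(toy_labels, toy_scores))  # 0.9375
```

Sensitivity and specificity depend on the chosen probability threshold, whereas AUC does not; the abstract does not state which threshold produced the reported 0.787 and 0.839.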
Authors
LI Ya-nan (李雅楠)
ZHU Ming-fang (朱明芳)
YANG Meng-li (杨孟丽)
FENG Ying-pu (冯英璞)
ZHAO Rui (赵瑞)
LI Lu-lu (李璐璐)
YE Lin (叶林)
YANG Meng-yuan (杨梦园)
WANG Yuan (王媛)
Source
Chinese Journal of Convalescent Medicine (《中国疗养医学》)
2024, Issue 10, pp. 1-7 (7 pages)
Funding
Henan Province Medical Education Research Project (Wjlx2022013).