Abstract
Objective: To construct a machine learning model that predicts the risk of lower-limb deep venous thrombosis (DVT) in patients undergoing total hip arthroplasty (THA), and to identify the key factors influencing DVT risk using the Shapley additive explanations (SHAP) method.
Methods: We retrospectively analyzed data from patients who underwent THA at Wenzhou People's Hospital between January 1, 2017 and July 31, 2022, and randomly split them into a training set and a test set at a 4:1 ratio. Recursive feature elimination with five-fold cross-validation was used to select the best feature subset. Six machine learning algorithms were used to develop prediction models, which were evaluated with multiple performance metrics. SHAP was used to interpret the best-performing model.
Results: A total of 416 THA patients were included in the final study, 333 in the training set and 83 in the test set. The XGBoost model performed best on the test set, with a sensitivity of 0.817, specificity of 0.783, F1 score of 0.860, ROC-AUC of 0.800, and Brier score of 0.106. The SHAP summary plot showed that the five most important factors for DVT after THA were, in order, age, cholesterol, postoperative time in bed, fibrinogen, and preoperative plasma D-dimer concentration. SHAP feature dependence plots showed that age, cholesterol, postoperative time in bed, and fibrinogen each had a complex non-linear relationship with DVT risk: age, postoperative time in bed, and fibrinogen showed an inverted U-shaped association, whereas cholesterol was positively associated with risk. Single-sample SHAP explanations showed how each predictor contributed to DVT risk for an individual patient.
Conclusion: This study developed an efficient and interpretable machine learning model for predicting DVT risk in THA patients, which can help clinicians identify high-risk patients and provide personalized interventions.
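The evaluation metrics named in the Results (sensitivity, specificity, F1 score, ROC-AUC, and Brier score) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and probabilities, and the 0.5 classification threshold are all assumptions made for illustration, since the abstract does not state how predicted probabilities were converted to class labels.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn); the threshold is an assumed cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Share of true DVT cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-DVT cases that the model clears."""
    return tn / (tn + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity."""
    return 2 * tp / (2 * tp + fp + fn)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
    tp, fp, tn, fn = confusion_counts(labels, probs)
    print("sensitivity", sensitivity(tp, fn))
    print("specificity", specificity(tn, fp))
    print("F1", f1_score(tp, fp, fn))
    print("Brier", brier_score(labels, probs))
    print("ROC-AUC", roc_auc(labels, probs))
```

Sensitivity, specificity, and F1 depend on the chosen threshold, while ROC-AUC and the Brier score are computed directly from the predicted probabilities, which is one reason studies like this one report them together.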
Authors
徐青
余冰
周佩敏
戴沈洁
董晓敏
Xu Qing; Yu Bing; Zhou Peimin; Dai Shenjie; Dong Xiaomin (Department of Orthopedics, Wenzhou People's Hospital, Wenzhou 325000, China)
Source
《中国医院统计》
2024, Issue 1, pp. 11-18, 24 (9 pages in total)
Chinese Journal of Hospital Statistics
Keywords
total hip arthroplasty
deep venous thrombosis
machine learning
predictive model
model interpretation