Abstract
Objective To construct risk prediction models for death or readmission during the vulnerable phase in patients with acute heart failure (AHF) using different machine learning algorithms, and to identify the optimal model.
Methods A total of 651 AHF patients admitted to the Department of Cardiology of the Second Affiliated Hospital of Army Medical University from October 2019 to July 2021 were included. Clinical data, including admission vital signs, comorbidities and laboratory results, were collected from electronic medical records. The composite endpoint was defined as all-cause death or readmission for worsening heart failure within 3 months after discharge. The patients were divided into a training set (521 patients) and a test set (130 patients) at a ratio of 8:2 by simple random sampling. Six machine learning models were developed: logistic regression (LR), random forest (RF), decision tree (DT), light gradient boosting machine (LGBM), extreme gradient boosting (XGBoost) and neural network (NN). Receiver operating characteristic (ROC) curves and decision curve analysis (DCA) were used to evaluate the predictive performance and clinical benefit of the models. Shapley additive explanation (SHAP) was used to evaluate the effect of different clinical characteristics on the models.
Results Of the 651 AHF patients, 203 (31.2%) died or were readmitted during the vulnerable phase. ROC curve analysis showed that the area under the curve (AUC) values of the LR, RF, DT, LGBM, XGBoost and NN models were 0.707, 0.756, 0.616, 0.677, 0.768 and 0.681, respectively. The XGBoost model had the highest AUC. DCA showed that the XGBoost model also yielded a greater net clinical benefit than the other models, giving it the best overall predictive performance. SHAP analysis showed that the clinical features with the greatest impact on the output of the XGBoost model were serum uric acid, D-dimer, mean arterial pressure, B-type natriuretic peptide, left atrial anteroposterior diameter, body mass index, and New York Heart Association (NYHA) classification.
Conclusion The XGBoost model performs best in predicting the risk of death or readmission in AHF patients during the vulnerable phase.
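The evaluation steps named in the abstract (an 8:2 simple random split, ROC AUC, and the net benefit underlying decision curve analysis) can be illustrated with a minimal sketch. This is not the authors' pipeline: the risk scores below are synthetic, the function names `auc_score` and `net_benefit` are hypothetical helpers, and the threshold of 0.3 is an arbitrary illustration. Only the cohort size (651), the event count (203) and the split ratio are taken from the abstract.

```python
import random


def auc_score(y_true, y_prob):
    """AUC as the probability that a random event case is scored above
    a random non-event case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at threshold probability pt: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Synthetic stand-in cohort: 651 patients, 203 composite endpoint events.
rng = random.Random(0)
n_total, n_events = 651, 203
labels = [1] * n_events + [0] * (n_total - n_events)
rng.shuffle(labels)

# Hypothetical model output: events tend to receive higher predicted risk.
scores = [min(max(rng.gauss(0.45 if y else 0.25, 0.15), 0.0), 1.0) for y in labels]

# Simple random 8:2 split into training and test indices.
idx = list(range(n_total))
rng.shuffle(idx)
n_train = round(0.8 * n_total)
test_idx = idx[n_train:]
y_test = [labels[i] for i in test_idx]
p_test = [scores[i] for i in test_idx]

event_rate = n_events / n_total
test_auc = auc_score(y_test, p_test)
nb_model = net_benefit(y_test, p_test, 0.3)
# "Treat all" reference strategy: every patient is classified as high risk.
nb_treat_all = net_benefit(y_test, [1.0] * len(y_test), 0.3)

print(f"event rate: {event_rate:.1%}, train/test: {n_train}/{len(test_idx)}")
print(f"test AUC: {test_auc:.3f}")
print(f"net benefit at pt=0.3: model {nb_model:.3f}, treat-all {nb_treat_all:.3f}")
```

A decision curve is obtained by repeating the net benefit calculation over a range of thresholds and comparing each model against the treat-all and treat-none (net benefit 0) strategies.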
Authors
曾竟
何小龙
胡华娟
罗晓宇
郭志念
陈运龙
王敏
王江
ZENG Jing; HE Xiaolong; HU Huajuan; LUO Xiaoyu; GUO Zhinian; CHEN Yunlong; WANG Min; WANG Jiang (Department of Cardiology, PLA Institute of Cardiovascular Diseases, Second Affiliated Hospital, Army Medical University (Third Military Medical University), Chongqing, 400037; School of Statistics, Southwestern University of Finance and Economics, Chengdu, Sichuan Province, 611130, China)
Source
《陆军军医大学学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 7, pp. 738-745 (8 pages)
Journal of Army Medical University
Funding
Chongqing Science and Health Joint Medical Research Key Project (2023ZDXM035).