Abstract
Objective To evaluate the application value of a random forest (RF)-based early warning model for sepsis risk in patients with suspected infection.
Methods The Medical Information Mart for Intensive Care Ⅲ (MIMIC-Ⅲ) was used as the data source, and clinical features of patients with suspected infection, including general condition, vital signs and laboratory findings, were extracted. The included dataset was randomly divided into a training set and an internal validation set at a ratio of 0.7:0.3. In the training set, sepsis risk warning models were built with RF and with Logistic regression (LR). The two models were compared by the area under the receiver operating characteristic curve (AUC) to assess the warning performance of the RF model, which was then validated internally on the internal validation set. Clinical data of 206 patients with suspected infection admitted to the emergency department and intensive care unit of the Second Affiliated Hospital of Hainan Medical University from January 2019 to January 2020 served as the external validation set, on which the discrimination and calibration of the RF model were validated externally.
Results In the variable importance scores of the RF model, eight variables scored relatively high: age, mean arterial pressure, heart rate, hemoglobin, platelets, serum creatinine, blood urea nitrogen and lymphocyte count. The RF model had a sensitivity of 65.8%, a specificity of 84.1% and an AUC of 0.830 (95% CI: 0.811-0.848). The factors finally included in the LR model were age, serum creatinine, blood urea nitrogen and C-reactive protein; this model had a sensitivity of 58.2%, a specificity of 59.8% and an AUC of 0.620 (95% CI: 0.596-0.643). In the internal validation set, the AUC of the RF model was 0.812 (95% CI: 0.774-0.850), close to its AUC in the training set. In the external validation set, the C-index and the AUC of the RF model were both 0.825 (95% CI: 0.762-0.888), again very close to the AUC in the training set, indicating good discrimination. The Bootstrap resampling calibration curve showed a mean absolute error of 0.022 between the sepsis risk predicted by the RF model and the observed sepsis risk, indicating good calibration.
Conclusion The RF sepsis risk warning model shows relatively good predictive performance, with high stability, effectiveness, accuracy and feasibility, and has certain clinical application value.
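The evaluation steps named in the abstract (a 0.7:0.3 random split, AUC, sensitivity and specificity) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_70_30`, `auc` and `sensitivity_specificity`, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and no MIMIC-Ⅲ data is used.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and split them 0.7:0.3 into training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the chance that a positive case scores above a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (made-up values, 1 = sepsis, 0 = no sepsis).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
train_rows, valid_rows = split_70_30(range(10))
```

In practice the scores would be the predicted probabilities of a fitted RF or LR model on the validation set. The pairwise AUC above is quadratic in sample size, so a rank-based implementation from a statistics library would be the more usual choice on a dataset of realistic size.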
Authors
由媛丽
邢柏
YOU Yuanli; XING Bo (Department of Emergency, the Second Affiliated Hospital of Hainan Medical University, Haikou, Hainan 570311, China)
Source
《中国急救复苏与灾害医学杂志》
2021, Issue 12, pp. 1375-1380, 1385 (7 pages in total)
China Journal of Emergency Resuscitation and Disaster Medicine
Funding
Hainan Provincial Natural Science Foundation project (No. 819MS128)
Hainan Provincial Postgraduate Innovative Research Project (No. Hys2019-306).