Abstract
BACKGROUND With the recent change in the definition of sepsis and septic shock (Sepsis-3), an electronic search algorithm was required to identify cases automatically. Such a supervised machine learning method would help screen large volumes of electronic medical records (EMR) efficiently for research purposes.

AIM To develop and validate a computable phenotype, via a supervised machine learning method, for retrospectively identifying sepsis and septic shock in critical care patients.

METHODS A supervised machine learning method was developed based on culture orders, Sequential Organ Failure Assessment (SOFA) scores, serum lactate levels and vasopressor use in the intensive care units (ICUs). The computable phenotype was derived from a retrospective analysis of a random cohort of 100 patients admitted to the medical ICU and then validated in an independent cohort of 100 patients. We compared the results of the computable phenotype with a gold standard of manual EMR review by 2 blinded reviewers; disagreements were resolved by a critical care clinician. The algorithm identified patients with a SOFA score ≥ 2 during the ICU stay and a culture ordered within 72 h before or after the time of admission. Sepsis-V1 was defined as a blood culture with a SOFA score ≥ 2, and Sepsis-V2 as any culture with a SOFA score ≥ 2. Septic Shock-V1 and -V2 were defined as Sepsis-V1 and -V2, respectively, together with vasopressor use and a serum lactate level ≥ 2 mmol/L measured between 24 h before admission and the end of the ICU stay.

RESULTS In the derivation subset of 100 random patients, the final machine learning strategy achieved a sensitivity and specificity of 100% and 84% for Sepsis-V1, 100% and 95% for Sepsis-V2, 78% and 80% for Septic Shock-V1, and 80% and 90% for Septic Shock-V2. Agreement between the two blinded reviewers was κ = 0.86 for Sepsis-V2 and κ = 0.90 for Septic Shock-V2. In the validation of the algorithm on a separate subset of 100 random patients, sensitivity and specificity were both 100% for all 4 diagnoses.

CONCLUSION Supervised machine learning for identification of sepsis and septic shock is reliable and an efficient alternative to manual chart review.
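
The four definitions in METHODS can be read as a small set of rules. The following is a minimal sketch of that reading, not the authors' implementation: the record layout (`IcuStay`, `Culture`) and all field names are hypothetical, and it assumes that the peak SOFA score, the peak lactate in the stated window and the vasopressor flag have already been extracted from the EMR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Culture:
    # Hypothetical field: hours relative to ICU admission (negative = before).
    hours_from_admission: float
    is_blood: bool


@dataclass
class IcuStay:
    # Hypothetical, pre-extracted summary of one ICU admission.
    max_sofa: int                                # highest SOFA score during the ICU stay
    cultures: List[Culture] = field(default_factory=list)
    max_lactate_mmol_l: Optional[float] = None   # peak lactate, 24 h pre-admission to end of stay
    vasopressor_used: bool = False


def classify(stay: IcuStay) -> Dict[str, bool]:
    """Apply the abstract's thresholds: SOFA >= 2, culture within 72 h, lactate >= 2 mmol/L."""
    organ_dysfunction = stay.max_sofa >= 2
    # Cultures ordered within 72 h before or after admission.
    in_window = [c for c in stay.cultures if abs(c.hours_from_admission) <= 72]

    sepsis_v1 = organ_dysfunction and any(c.is_blood for c in in_window)
    sepsis_v2 = organ_dysfunction and len(in_window) > 0

    shock_criteria = (
        stay.max_lactate_mmol_l is not None
        and stay.max_lactate_mmol_l >= 2.0
        and stay.vasopressor_used
    )
    return {
        "sepsis_v1": sepsis_v1,
        "sepsis_v2": sepsis_v2,
        "septic_shock_v1": sepsis_v1 and shock_criteria,
        "septic_shock_v2": sepsis_v2 and shock_criteria,
    }


if __name__ == "__main__":
    # Toy record, not study data: urine culture only, raised lactate, on vasopressors.
    example = IcuStay(
        max_sofa=3,
        cultures=[Culture(hours_from_admission=-10, is_blood=False)],
        max_lactate_mmol_l=2.5,
        vasopressor_used=True,
    )
    print(classify(example))
```

In this toy record only a non-blood culture falls inside the window, so the stay meets the V2 definitions but not the V1 definitions.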
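
The sensitivity, specificity and inter-reviewer κ values reported in RESULTS follow the standard definitions. The sketch below shows how such figures can be computed from paired binary labels; the function names are illustrative, the labels are toy values rather than study data, and the abstract does not state which software the authors used.

```python
import math
from typing import Sequence, Tuple


def sensitivity_specificity(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of algorithm labels against the gold standard."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    # Undefined (NaN) when the gold standard has no positives or no negatives.
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two raters assigning binary (0/1) labels."""
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    # Agreement expected by chance from each rater's marginal rates.
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1.0:
        return math.nan  # both raters gave one identical constant label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy labels only (1 = condition present, 0 = absent).
    gold = [1, 1, 0, 0, 0]
    algo = [1, 1, 1, 0, 0]
    print(sensitivity_specificity(algo, gold))

    reviewer_1 = [1, 1, 0, 0]
    reviewer_2 = [1, 0, 0, 0]
    print(cohen_kappa(reviewer_1, reviewer_2))
```

With only 100 patients per cohort, point estimates like these carry wide uncertainty, so reporting confidence intervals alongside them would be a natural extension of this sketch.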