BACKGROUND
Intensive care unit (ICU) patients require continuous monitoring of several clinical and laboratory parameters that directly influence their clinical course and the staff’s decision-making. These data are vital to the care of these patients and are already used by several scoring systems. In this context, machine learning approaches have been used for medical predictions based on clinical data, including predictions of patient outcomes.

AIM
To develop a binary classifier for the outcome of death in ICU patients based on clinical and laboratory parameters. A set of 1087 instances and 50 variables from ICU patients admitted to the emergency department was obtained from the “WiDS (Women in Data Science) Datathon 2020: ICU Mortality Prediction” dataset.

METHODS
For categorical variables, frequencies and risk ratios were calculated. Numerical variables were summarized as means and standard deviations and compared using Mann-Whitney U tests. We then divided the data into a training set (80%) and a test set (20%). The training set was used to train a predictive model based on the Random Forest algorithm, and the test set was used to evaluate the model's predictive performance.

RESULTS
A statistically significant association was identified between hospital death and both the need for intubation and predominant systemic cardiovascular involvement. Several of the numerical variables analyzed (for instance Glasgow Coma Scale scores, mean arterial pressure, temperature, pH, and lactate, creatinine, albumin, and bilirubin values) were also significantly associated with death. On the test set (n = 218), the proposed binary Random Forest classifier achieved an accuracy of 80.28%, sensitivity of 81.82%, specificity of 79.43%, positive predictive value of 73.26%, negative predictive value of 84.85%, F1 score of 0.74, and area under the curve of 0.85. The most important predictive variables were the maximum and minimum lactate values, which together accounted for a predictive importance of 15.54%.

CONCLUSION
We demonstrated the efficacy of a Random Forest machine learning algorithm for handling clinical and laboratory data from patients under intensive monitoring. We therefore endorse the emerging notion that machine learning has great potential to help us critically question existing methodologies, allowing improvements that reduce mortality.
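
The workflow described in METHODS (an 80/20 split followed by a Random Forest fit and evaluation on the held-out set) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the WiDS dataset, and the stratified split, `n_estimators=200`, and the random seeds are assumptions the abstract does not state.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the WiDS dataset): 1087 rows, 50 features,
# chosen only to mirror the dimensions quoted in the abstract.
X, y = make_classification(
    n_samples=1087, n_features=50, n_informative=10, random_state=0
)

# 80/20 split. Stratifying on the outcome is an assumption; the abstract
# does not say how the split was drawn.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Illustrative hyperparameters, not taken from the paper.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Area under the ROC curve, from the predicted probability of class 1.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Impurity-based feature importances; they are normalized to sum to 1,
# so the share of any group of features can be read off as a percentage.
importances = model.feature_importances_
top_two = sorted(enumerate(importances), key=lambda kv: kv[1], reverse=True)[:2]

print(f"train size: {len(X_train)}, test size: {len(X_test)}, AUC: {auc:.2f}")
for index, weight in top_two:
    print(f"feature {index}: {100 * weight:.2f}% of total importance")
```

With 1087 rows, `test_size=0.2` yields a held-out set of 218 rows, which agrees with the n = 218 reported in RESULTS. The importance percentages printed here come from the synthetic data and say nothing about lactate or any other clinical variable.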
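
For readers less familiar with the performance measures listed in RESULTS, the sketch below shows how accuracy, sensitivity, specificity, positive and negative predictive value, and F1 score follow from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical, chosen for easy arithmetic; they are not the study's confusion matrix, which the abstract does not report.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: deaths correctly flagged
    specificity = tn / (tn + fp)   # survivors correctly cleared
    ppv = tp / (tp + fp)           # precision: flagged cases that were deaths
    npv = tn / (tn + fn)           # cleared cases that were survivors
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only (100 patients in total).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because sensitivity and specificity are conditioned on the true outcome while the predictive values are conditioned on the prediction, the two pairs can differ noticeably when the classes are imbalanced.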