2. Early Prediction of Mortality for Septic Patients Visiting Emergency Room Based on Explainable Machine Learning: A Real-World Multicenter Study
Sang Won PARK ; Na Young YEO ; Seonguk KANG ; Taejun HA ; Tae-Hoon KIM ; DooHee LEE ; Dowon KIM ; Seheon CHOI ; Minkyu KIM ; DongHoon LEE ; DoHyeon KIM ; Woo Jin KIM ; Seung-Joon LEE ; Yeon-Jeong HEO ; Da Hye MOON ; Seon-Sook HAN ; Yoon KIM ; Hyun-Soo CHOI ; Dong Kyu OH ; Su Yeon LEE ; MiHyeon PARK ; Chae-Man LIM ; Jeongwon HEO ; On behalf of the Korean Sepsis Alliance (KSA) Investigators
Journal of Korean Medical Science 2024;39(5):e53-
Background:
Worldwide, sepsis is the leading cause of death in hospitals. If mortality rates in patients with sepsis can be predicted early, medical resources can be allocated efficiently. We constructed machine learning (ML) models to predict the mortality of patients with sepsis in a hospital emergency department.
Methods:
This study prospectively collected nationwide data from an ongoing multicenter cohort of patients with sepsis identified in the emergency department. Patients were enrolled from 19 hospitals between September 2019 and December 2020. Using data from 3,657 survivors and 1,455 non-survivors, six ML models (logistic regression, support vector machine, random forest, extreme gradient boosting [XGBoost], light gradient boosting machine, and categorical boosting [CatBoost]) were constructed with fivefold cross-validation to predict mortality. Models built on 44 clinical variables measured on the day of admission were compared with models built on the six Sequential Organ Failure Assessment (SOFA) components (PaO2/FiO2 [PF], platelets [PLT], bilirubin, cardiovascular, Glasgow Coma Scale score, and creatinine). The confidence interval (CI) was obtained from 10,000 repeated random samplings of the test dataset. All results were explained and interpreted using SHapley Additive exPlanations (SHAP).
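The CI procedure described above can be sketched as a percentile bootstrap of the AUC. This is a minimal illustration, not the authors' code: it assumes the repeated random sampling was done with replacement, and the function names `auc` and `bootstrap_auc_ci`, the default seed, and the toy labels and scores are all hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile CI for the AUC from resampling the test set with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Toy test set (made-up values): 1 = died, 0 = survived, with predicted risks.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.55, 0.90, 0.30, 0.50]
point_estimate = auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_resamples=500)
```

The pairwise AUC is quadratic in the sample size, so on a cohort of this size a rank-based implementation from a statistics library would be the practical choice.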
Results:
Among the 5,112 participants, CatBoost achieved the highest area under the curve (AUC), 0.800 (95% CI, 0.756–0.840), using clinical variables. Using the SOFA components for the same patients, XGBoost achieved the highest AUC, 0.678 (95% CI, 0.626–0.730). As interpreted by SHAP, albumin, lactate, blood urea nitrogen, and international normalized ratio substantially affected the predictions. Additionally, among the SOFA components, PF and PLT substantially influenced the predictions.
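The idea behind the SHAP attributions reported above can be illustrated with exact Shapley values on a toy score. This is a sketch under stated assumptions, not the study's model: `toy_risk`, its coefficients, and the unitless input values are hypothetical, and the study's models would normally be explained with faster algorithms than the brute-force enumeration shown here.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution over all orderings."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)       # start from the reference patient
        previous = model(current)
        for name in order:
            current[name] = x[name]    # switch one feature to the patient's value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}

def toy_risk(f):
    # Hypothetical score with made-up coefficients, for illustration only.
    return 0.5 * f["lactate"] - 0.8 * f["albumin"] + 0.1 * f["lactate"] * f["bun"]

# Made-up, unitless feature values for a reference patient and a patient of interest.
baseline = {"lactate": 1.0, "albumin": 4.0, "bun": 1.0}
patient = {"lactate": 4.0, "albumin": 2.0, "bun": 3.0}
phi = shapley_values(toy_risk, patient, baseline)
```

The attributions sum to the difference between the patient's score and the reference score, which is the property that lets SHAP split a single prediction among its input variables.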
Conclusion:
The newly established ML-based models achieved good prediction of mortality in patients with sepsis. Using several clinical variables acquired at baseline can yield more accurate early predictions than using SOFA components. Additionally, the impact of each variable on the predictions was identified.