Early mortality risk prediction models for patients with sepsis-induced cardiorenal syndrome based on machine learning
10.3760/cma.j.cn441217-20211126-00113
- VernacularTitle:基于机器学习建立脓毒症心肾综合征患者早期死亡风险预测模型
- Author: Yingying ZHANG 1; Yiguo LIU; Dan ZHAO; Zhenyu SHI; Chen YU
Author Information
1. Department of Nephrology, Tongji Hospital, Tongji University, Shanghai 200065
- Keywords: Machine learning; Sepsis; Cardio-renal syndrome; Mortality risk
- From: Chinese Journal of Nephrology 2022;38(9):785-793
- Country:China
- Language:Chinese
- Abstract:
Objective: To explore how to construct an early mortality risk prediction model for patients with sepsis-induced cardiorenal syndrome using machine learning algorithms, so as to provide a basis for early clinical identification of high-risk patients and precise treatment. Methods: Patients with sepsis-induced cardiorenal syndrome at Tongji Hospital, Tongji University from January 1, 2015 to May 31, 2019 were enrolled. Basic characteristics, laboratory indexes, in-hospital treatment and other relevant baseline data were collected. The primary end-point event was death within 30 days after diagnosis. Python was used to build different machine learning models, and the area under the receiver operating characteristic curve (AUC) was used to evaluate their predictive value. Disease-related risk factors were selected according to the best-performing model. Visualized decision tree and semi-naive Bayesian (sNB) models were then established to further explore the interrelationships between these risk factors. Results: A total of 340 patients were included, of whom 114 (33.5%) died within 30 days after diagnosis. The AUCs of the support vector machine (SVM), random forest (RF), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine (LGBM) prediction models were 0.652, 0.868, 0.870, 0.754, and 0.852, respectively. The GBDT model had the highest AUC and therefore predicted the end-point event best. According to the feature ranking of the GBDT model, the relevant influencing factors were selected, including total sequential organ failure assessment (SOFA) score, neural SOFA score, vasoactive drug application, cardiac troponin I (cTnI), age, myoglobin, circulation system SOFA score, chronic kidney disease, heart rate and baseline serum creatinine.
The visualized decision tree model had 4 layers, 15 nodes and 8 terminal nodes, based on total SOFA score, myoglobin, baseline serum creatinine and age. The total SOFA score, change rate of myoglobin, serum creatinine and age were included in the visualized decision tree model. The AUC of this model for predicting the end-point event was 0.690. The sNB model revealed complex correlations between the risk factors: neural SOFA score was related to total SOFA score, vasoactive drug application was related to total SOFA score, and cTnI was related to baseline serum creatinine. Conclusions: A machine learning-based mortality risk prediction model for patients with sepsis-induced cardiorenal syndrome was established, and it shows that a high SOFA score remains the primary risk factor for these patients. Visualized decision tree and sNB models help clinicians clearly identify the dependence and logical relationships among these risk factors and provide a novel method for predicting mortality risk in patients with sepsis-induced cardiorenal syndrome.
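
Note on the AUC metric: the abstract compares models by AUC but does not include the authors' code. As an illustration only, the sketch below shows one standard way to compute AUC, as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor (ties count as one half). The function name `roc_auc` and the labels and scores are hypothetical, made up for this example; they are not the study's data or implementation.

```python
def roc_auc(labels, scores):
    """AUC by pairwise comparison of positive and negative cases.

    labels: 1 = end-point event (death within 30 days), 0 = no event.
    scores: predicted risk from any model; higher means higher risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is how values such as 0.652 for SVM and 0.870 for GBDT can be read. The pairwise loop is quadratic in the number of cases; that is acceptable for a cohort of a few hundred patients, while a rank-based formulation would be preferable for larger data sets.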