Prediction of intensive care unit readmission for critically ill patients based on ensemble learning.
- Author:Yu LIN 1; Jing Yi WU 2; Ke LIN 1; Yong Hua HU 3; Gui Lan KONG 1
Author Information
1. National Institute of Health Data Science, Peking University, Beijing 100191, China.
2. Advanced Institute of Information Technology, Peking University, Hangzhou 311215, China.
3. Department of Epidemiology and Biostatistics, Peking University School of Public Health, Beijing 100191, China.
- Publication Type:Journal Article
- Keywords:
Intensive care units;
Machine learning;
Patient readmission;
Predictive value of tests
- MeSH:
Critical Illness;
Humans;
Intensive Care Units;
Machine Learning;
Patient Readmission;
ROC Curve
- From:
Journal of Peking University(Health Sciences)
2021;53(3):566-572
- Country:China
- Language:Chinese
- Abstract:
OBJECTIVE:To develop machine learning models for predicting intensive care unit (ICU) readmission using ensemble learning algorithms.
METHODS:The Medical Information Mart for Intensive Care (MIMIC)-Ⅲ, a publicly accessible American ICU database, was used as the data source, and patients were selected according to the inclusion and exclusion criteria. Variables predictive of the outcome, including patient demographics, vital signs, laboratory tests, and comorbidities, were extracted from the dataset. We built ICU readmission prediction models using three ensemble learning methods, namely random forest, adaptive boosting (AdaBoost), and gradient boosting decision tree (GBDT), and compared their prediction performance with that of a conventional logistic regression model. Five-fold cross validation was used to train and validate the prediction models. Average sensitivity, positive predictive value, negative predictive value, false positive rate, false negative rate, area under the receiver operating characteristic curve (AUROC), and Brier score were used as performance measures. After the prediction models were constructed, the top 10 predictive variables were identified from the importance ranking of the model with the best discrimination.
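As an illustration of the performance measures named in the methods, the following minimal Python sketch computes them from true labels and predicted readmission probabilities, then averages them across folds. It is not the authors' code. The function names (`evaluate`, `mean_metrics`), the 0.5 classification threshold, and the toy data are assumptions introduced here for illustration; the abstract does not state which threshold or software was used.

```python
from typing import Dict, List, Sequence, Tuple


def _ratio(num: float, den: float) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Count TP, FP, TN, FN after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def evaluate(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute the seven measures for one validation fold.
    The 0.5 default threshold is an assumption, not taken from the paper."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
        "auroc": auroc(y_true, y_prob),
        "brier": brier_score(y_true, y_prob),
    }


def mean_metrics(folds: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each measure over the cross-validation folds."""
    return {k: sum(f[k] for f in folds) / len(folds) for k in folds[0]}


# Toy data (hypothetical): 1 = readmitted to ICU, 0 = not readmitted.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
metrics = evaluate(y_true, y_prob)
```

In a five-fold setting, `evaluate` would be called once per held-out fold and the resulting dictionaries passed to `mean_metrics`. AUROC and the Brier score do not depend on the threshold, whereas the other five measures do.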
RESULTS:Among the ICU readmission prediction models, GBDT (AUROC=0.858) performed better than random forest (AUROC=0.827) and slightly better than AdaBoost (AUROC=0.851) in terms of AUROC. All three ensemble learning models discriminated better than logistic regression (AUROC=0.810). The feature importance provided by GBDT showed that the top-ranking variables included vital signs and laboratory tests. Patients with ICU readmission had higher mean arterial pressure, systolic blood pressure, diastolic blood pressure, and heart rate than patients without ICU readmission. They also had lower urine output and higher serum creatinine. Overall, patients readmitted to the ICU during their hospitalization showed worse cardiac and renal function than patients without ICU readmission.
CONCLUSION:The ensemble learning-based ICU readmission prediction models performed better than the logistic regression model. Such models have the potential to help ICU physicians identify patients at high risk of ICU readmission and thus help improve overall clinical outcomes.