Predicting prolonged length of intensive care unit stay via machine learning.
- Author:
Jing Yi WU 1;
Yu LIN 1;
Ke LIN 1;
Yong Hua HU 1;
Gui Lan KONG 2
Author Information
1. Department of Epidemiology and Biostatistics, Peking University School of Public Health, Beijing 100191, China.
2. Advanced Institute of Information Technology, Peking University, Hangzhou 311200, China.
- Publication Type:Journal Article
- Keywords:
Intensive care units;
Length of stay;
Machine learning;
Random forest;
Simplified acute physiology score
- MeSH:
Aged;
Humans;
Intensive Care Units;
Machine Learning;
Male;
Middle Aged;
Research Design
- From:
Journal of Peking University(Health Sciences)
2021;53(6):1163-1170
- Country:China
- Language:Chinese
- Abstract:
OBJECTIVE: To construct prediction models for length of intensive care unit (ICU) stay (LOS-ICU) in ICU patients using three machine learning methods, namely support vector machine (SVM), classification and regression tree (CART), and random forest (RF), and to compare their prediction performance with that of the customized simplified acute physiology score Ⅱ (SAPS-Ⅱ) model.
METHODS: We used the Medical Information Mart for Intensive Care (MIMIC)-Ⅲ database for model development and validation. The primary outcome was prolonged LOS-ICU (pLOS-ICU), defined as an LOS-ICU longer than the third quartile of patients' LOS-ICU in the studied dataset. Recursive feature elimination was used for feature selection in the three machine learning models. We used 5-fold cross-validation to evaluate prediction performance. The Brier value, area under the receiver operating characteristic curve (AUROC), and estimated calibration index (ECI) were used as performance measures. The performance of the four models was compared, and differences between models were assessed using two-sided t tests. The model with the best prediction performance was used to generate a variable importance ranking, and the top five predictors were presented.
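The outcome definition and two of the performance measures described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the length-of-stay values, and the predicted probabilities are all made up for illustration, and the quartile is computed with Python's default (exclusive) method, which may differ from the method used in the study.

```python
from statistics import mean, quantiles


def label_prolonged(los_days):
    """Label a stay as prolonged (1) if it exceeds the sample's third quartile."""
    q3 = quantiles(los_days, n=4)[2]  # third of the three quartile cut points
    return [1 if d > q3 else 0 for d in los_days], q3


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return mean((p - y) ** 2 for y, p in zip(y_true, y_prob))


def auroc(y_true, y_prob):
    """Probability that a random positive is ranked above a random negative
    (ties count as one half); a simple pairwise computation."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ICU stays in days and hypothetical model outputs.
los = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 6.0, 9.0]
probs = [0.05, 0.10, 0.20, 0.15, 0.30, 0.40, 0.70, 0.90]

labels, q3 = label_prolonged(los)
print(q3, labels)
print(brier_score(labels, probs), auroc(labels, probs))
```

A lower Brier value indicates better overall performance, and a higher AUROC indicates better discrimination. The pairwise AUROC loop is chosen here for readability rather than speed.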
RESULTS: The final cohort consisted of 40 200 eligible ICU patients, of whom 23.7% had pLOS-ICU. Male patients accounted for 57.6% of the cohort, and the age of all the ICU patients was (61.9±16.5) years. The three machine learning models outperformed the customized SAPS-Ⅱ model on all performance measures, with statistical significance (P < 0.01). Among the three machine learning models, the RF model achieved the best overall performance (Brier value, 0.145), discrimination (AUROC, 0.770), and calibration (ECI, 7.259). The calibration curve showed that the RF model slightly overestimated the risk of pLOS-ICU in high-risk ICU patients but underestimated it in low-risk ICU patients. The top five predictors of pLOS-ICU identified by the RF model were age, heart rate, systolic blood pressure, body temperature, and the ratio of arterial oxygen tension to the fraction of inspired oxygen (PaO2/FiO2).
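The calibration pattern reported above (overestimation at high predicted risk, underestimation at low predicted risk) can be illustrated with a simple binned comparison of mean predicted risk and observed event rate. This is a simplified stand-in, not the ECI and not the authors' calibration curve; the function name and the toy data are hypothetical.

```python
def calibration_table(y_true, y_prob, n_bins=2):
    """Split patients into equal-size bins by predicted risk and return
    (mean predicted risk, observed event rate) for each bin, lowest risk first."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) // n_bins
    if size == 0:
        raise ValueError("n_bins must not exceed the number of patients")
    rows = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = len(pairs) if b == n_bins - 1 else start + size
        chunk = pairs[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows


# Made-up predictions and outcomes chosen to mimic the reported pattern.
toy_prob = [0.1, 0.1, 0.2, 0.2, 0.6, 0.7, 0.8, 0.9]
toy_true = [0, 1, 0, 1, 1, 0, 1, 0]

for mean_pred, observed in calibration_table(toy_true, toy_prob):
    direction = "overestimates" if mean_pred > observed else "underestimates"
    print(f"predicted {mean_pred:.2f}, observed {observed:.2f}: model {direction}")
```

In the low-risk bin the mean prediction falls below the observed rate, and in the high-risk bin it exceeds it, which is the direction of miscalibration the abstract describes for the RF model.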
CONCLUSION: The RF-based pLOS-ICU prediction model had the best prediction performance in this study. This lays a foundation for the future application of the RF-based model in ICU clinical practice.