Machine learning-based optimizing clinical prediction model for 28-day mortality in patients with sepsis
10.3969/j.issn.1008-9691.2024.06.003
- VernacularTitle:基于机器学习的脓毒症患者早期生存预测模型构建
- Author:Yan ZHUANG 1; Linfeng DAI 1; Haidong ZHANG 1; Qiuhua CHEN 1; Qingfang NIE 1; Wenjing DU 1; Yan YANG 1
Author Information
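1. Department of Critical Care Medicine, Affiliated Hospital of Nanjing University of Chinese Medicine, Nanjing 210029, Jiangsu, China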
- Publication Type:Journal Article
- Keywords:
Sepsis;
Risk factor;
28-day mortality;
Clinical prediction model;
Machine learning algorithm
- From:
Chinese Journal of Integrated Traditional and Western Medicine in Intensive and Critical Care
2024;31(6):653-658
- Country:China
- Language:Chinese
- Abstract:
Objective: To investigate the risk factors for 28-day mortality in patients with sepsis and to develop an optimized clinical prediction model based on machine learning algorithms.
Methods: Data from patients admitted to the intensive care unit (ICU) of the Affiliated Hospital of Nanjing University of Chinese Medicine from January 2019 to December 2023 were retrospectively analyzed. The data extracted included: ① gender, age, and history of hypertension, diabetes, coronary heart disease, chronic obstructive pulmonary disease (COPD) and chronic kidney disease (CKD); ② vital signs and laboratory results at admission, from which the acute physiology and chronic health evaluation Ⅱ (APACHE Ⅱ) score and the sequential organ failure assessment (SOFA) score were calculated; ③ other laboratory results not included in the APACHE Ⅱ and SOFA scores, such as blood lactic acid (Lac), aspartate aminotransferase (AST), hemoglobin (Hb), procalcitonin (PCT), brain natriuretic peptide (BNP), C-reactive protein (CRP), activated partial thromboplastin time (APTT), D-dimer and troponin I (TNI). Patients were divided into a survival group and a death group according to 28-day survival, and the clinical data and laboratory indicators of the two groups were compared. LASSO regression and the Boruta algorithm were used to screen predictive variables. Logistic regression (LG), neural network (NN) and light gradient boosting machine (LightGBM) models were constructed. The data were divided into a training set and a validation set at a ratio of 7:3, and five-fold cross-validation was used to evaluate the stability of the models. Confusion matrices, receiver operating characteristic (ROC) curves and calibration curves were used to assess the discrimination and accuracy of the three models. Decision curve analysis (DCA) was conducted to evaluate the models' utility in decision-making. Shapley additive explanations (SHAP) analysis was used to explain the best-performing model.
Results: A total of 426 patients were included in the study, of whom 256 survived and 170 died. Compared with the death group, the survival group had significantly lower age (72.09±14.08 vs. 76.88±11.32), proportions of COPD [11.33% (29/256) vs. 20.00% (34/170)] and CKD [20.31% (52/256) vs. 31.77% (54/170)], Lac on admission [mmol/L: 1.72 (1.20, 2.66) vs. 2.25 (1.60, 3.50)], AST [U/L: 32.00 (18.00, 59.75) vs. 37.00 (24.00, 76.50)], CRP [mg/L: 71.23 (22.51, 151.79) vs. 87.00 (37.00, 173.36)], APACHE Ⅱ score (19.96±6.55 vs. 22.83±6.92) and SOFA score [7 (5, 10) vs. 9 (5, 12)]; all differences were statistically significant (all P<0.05). LASSO regression and the Boruta algorithm identified age, APACHE Ⅱ score, Lac, PCT and CRP as independent predictors of 28-day mortality in sepsis. These 5 variables were incorporated into the LG, NN and LightGBM models, and five-fold cross-validation showed that the LightGBM model had the best stability. The confusion matrices, ROC curves and calibration curves of the 3 models showed that, for the LG, NN and LightGBM models respectively, the F1 scores were 0.61, 0.63 and 0.74; the areas under the curve (AUC) were 0.68, 0.74 and 0.87; the Log Loss values were 0.62, 0.41 and 0.34; and the Brier scores were 0.22, 0.13 and 0.09, indicating that the LightGBM model was optimal. DCA showed that the LightGBM model had the greatest clinical net benefit. SHAP showed that the predicted results were in good agreement with the actual results.
Conclusion: The LightGBM model exhibited the best performance in predicting 28-day mortality in patients with sepsis and has the potential to help clinicians identify high-risk patients and guide clinical decision-making.
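For readers unfamiliar with the four metrics reported in the abstract (F1 score, AUC, Log Loss and Brier score), the following is a minimal Python sketch of how each can be computed from true outcomes and predicted probabilities. It is not the authors' code: the function names are hypothetical, the 0.5 classification threshold is an assumption, and the labels and probabilities are invented toy values rather than study data.

```python
import math

def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall at an assumed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, y_prob):
    """Probability that a random positive case is ranked above a random
    negative case (ties count as one half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; lower is better."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

# Toy example (1 = died within 28 days, 0 = survived); values are invented.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.8]

print("F1:", f1_score(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
print("Log Loss:", log_loss(y_true, y_prob))
print("Brier:", brier_score(y_true, y_prob))
```

Higher F1 and AUC indicate better discrimination, while lower Log Loss and Brier score indicate better-calibrated probabilities, which is the direction in which the abstract ranks the three models.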