Construct a machine learning model for early prediction of sepsis-induced respiratory tract infection
DOI: 10.3969/j.issn.1673-9701.2025.24.013
- VernacularTitle:基于机器学习的肺源性脓毒症早期风险预测模型
- Author: Lei ZHANG 1; Mingkuan SU 1; Haiying WU 1; Hongbin CHEN 1; Jiancheng HUANG 1
Author Information
1. Department of Clinical Laboratory, Mindong Hospital of Ningde City, Fu'an, Fujian 355000, China
- Publication Type:Journal Article
- Keywords: Machine learning; Sepsis; Respiratory tract infection; Procalcitonin
- From: China Modern Doctor 2025;63(24):63-67
- Country: China
- Language:Chinese
- Abstract:
Objective: To construct a biomarker-based machine learning model that predicts the risk of sepsis-induced respiratory tract infection and assists clinicians in making decisions.
Methods: Research subjects were enrolled according to the diagnostic criteria, and their basic clinical data were collected. The data set was randomly split into a training set (80%) and a validation set (20%). Feature filtering algorithms were used to select the best subset of variables from the training set, and this subset was used to construct random forest (RF), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), logistic regression (LR), ridge regression (Ridge), and support vector machine (SVM) classifiers. The generalization ability of each model was then evaluated on the validation set, and performance was assessed by accuracy, precision, recall, and area under the curve.
Results: A total of 377 patients with sepsis-induced respiratory tract infection (case group) and 564 patients with respiratory tract infection (control group) were included, and 17 variables were found suitable for initial model construction. After feature screening, the tree models (RF, XGBoost, and AdaBoost) showed better predictive performance than the linear models (LR, SVM, and Ridge). The AdaBoost model included 14 biomarkers, and its prediction accuracy was higher than that of the RF, XGBoost, LR, SVM, and Ridge models, with a precision of 0.90, a recall of 0.84, an accuracy of 91.75%, and an area under the curve of 0.950. The Ridge model had the worst prediction performance, with an accuracy of 82.97%, a precision of 0.90, a recall of 0.72, and an area under the curve of 0.835.
Conclusion: Six predictive models of sepsis-induced respiratory tract infection were developed in this study, among which the AdaBoost model predicted the risk of sepsis-induced respiratory tract infection most accurately and can help support clinical decision-making.
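The split and the four evaluation metrics named in the Methods can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code. The function names, the per-class (stratified) shuffling, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; the abstract says only that the split was random, and only the cohort sizes (377 cases, 564 controls) and the 80%/20% ratio come from the abstract.

```python
import random


def split_train_validation(labels, validation_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) index lists for an 80/20 split.

    Assumption: shuffling is done within each class (stratified) so both
    sets keep the case/control ratio; the abstract only says "randomly split".
    """
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_val = round(len(idx) * validation_fraction)
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return sorted(train_idx), sorted(val_idx)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case receives a higher
    score than a randomly chosen control; tied scores count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Cohort sizes from the abstract: 377 cases and 564 controls.
    labels = [1] * 377 + [0] * 564
    train_idx, val_idx = split_train_validation(labels)
    print(len(train_idx), len(val_idx))  # 753 188

    # Hypothetical validation labels and model scores, not study data.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(classification_metrics(y_true, y_pred))  # all three equal 0.75
    print(roc_auc(y_true, scores))  # 0.9375
```

Precision and recall depend on the chosen threshold while the area under the curve does not, which is one reason a study might report all four figures together.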