Machine learning model for prediction of bloodstream infections established based on routine test indexes and its predictive efficiency
10.11816/cn.ni.2025-241705
- VernacularTitle:基于常规检验指标预测血流感染的机器学习模型构建与评价
- Author:
Yan WANG (1); Xin HE; Yufang LIANG; Gaixian WANG; Ruifeng BAI; Rui ZHOU
Author Information
1. Medical Laboratory Center, Beijing Jishuitan Hospital, Capital Medical University, Beijing 100035, China
- Publication Type:Journal Article
- Keywords:
Machine learning;
Logistic regression;
Support vector machine;
Random forest;
Bloodstream infection;
Blood routine test;
Prediction model
- From:
Chinese Journal of Nosocomiology
2025;35(10):1542-1548
- Country:China
- Language:Chinese
- Abstract:
OBJECTIVE To build and evaluate a machine learning model that predicts bacterial bloodstream infections from routine test data.
METHODS In a retrospective survey, a total of 5,421 patients hospitalized in 3 medical institutions from January 2015 to December 2022 were enrolled; 1,914 were assigned to the bloodstream infection group and 3,507 to the non-bloodstream infection group. Baseline data (gender and age) and routine laboratory test results were collected from the enrolled patients. Three machine learning algorithms (logistic regression, support vector machine, and random forest) were compared to select the optimal prediction model. The contribution of each feature variable to the model's predictive capability was interpreted with SHapley Additive exPlanations (SHAP). The feature variables of the model were optimized by recursive feature elimination, and predictive performance was evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
RESULTS A total of 26 variables covering age, gender, and routine blood test indexes were included. Random forest was chosen as the optimal algorithm for the bloodstream infection prediction model; the model's accuracy was 0.709 and its AUC was 0.706. The SHAP explanation indicated that age, hematocrit, and red cell distribution width-CV had a marked effect on the model's correct predictions. The 17-variable model distinguished Gram-positive from Gram-negative bloodstream infections better than the 26-variable model, with an AUC of 0.715, a sensitivity of 0.701, and a specificity of 0.632.
CONCLUSIONS A prediction model built from routine blood test indexes with machine learning algorithms can predict bacterial bloodstream infection. In addition, the feature selection strategy can further improve the model's predictive performance while reducing dimensionality.
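
The abstract reports accuracy, AUC, sensitivity, and specificity. As a minimal sketch of how those four figures are computed from a model's predicted probabilities, the Python below uses only the standard library. The function names `roc_auc` and `confusion_metrics`, the 0.5 decision threshold, and the eight toy cases are assumptions made for illustration; none of them come from the study, and the study's own data and software are not reproduced here.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold.
    The 0.5 default is an illustrative assumption, not the study's cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Invented toy data: 1 = bloodstream infection, 0 = no bloodstream infection.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]

print("AUC:", roc_auc(toy_labels, toy_scores))
print(confusion_metrics(toy_labels, toy_scores))
```

AUC does not depend on a threshold, whereas accuracy, sensitivity, and specificity do, so the latter three change whenever the cutoff is moved.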