Development of auxiliary early predicting model for human brucellosis using machine learning algorithm.
10.3760/cma.j.cn112150-20221013-00991
- Author:
Wei WANG 1;
Rui ZHOU 2;
Chao CHEN 3;
Xiang FENG 4;
Wei ZHANG 5;
Hu Jin LI 1;
Rong Hua JIN 6
Author Information
1. Department of Blood Transfusion, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China.
2. Department of Clinical Laboratory, Beijing Chaoyang Hospital Affiliated to Capital Medical University, Beijing 100012, China.
3. Beijing Jinfeng Yitong Technology Co., Ltd, Beijing 100020, China.
4. Inner Mongolia Zhihui Big data Institute, Hohhot 010020, China.
5. Infection Center, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China.
6. Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China.
- Publication Type:Journal Article
- MeSH:
Male;
Humans;
Retrospective Studies;
Case-Control Studies;
Bayes Theorem;
Algorithms;
Machine Learning
- From:
Chinese Journal of Preventive Medicine
2023;57(10):1601-1607
- Country:China
- Language:Chinese
- Abstract:
This study used machine learning algorithms to construct an early prediction model of brucellosis, with the aim of improving the efficiency of brucellosis diagnosis. It was a case-control study. From May 9, 2011 to November 29, 2021, 2 381 brucellosis patients from Beijing Ditan Hospital affiliated to Capital Medical University were retrospectively collected as the case group, and 13 257 healthy people from Beijing Chaoyang Hospital affiliated to Capital Medical University were collected as the control group. Relevant clinical information and full blood count results were collected, and five machine learning algorithms were used to construct the early prediction model: random forest, naive Bayes, decision tree, logistic regression and support vector machine. A total of 14 074 records (2 143 in the case group and 11 931 in the control group) were used to build the model, and 1 564 records (238 in the case group and 1 326 in the control group) were used to test its predictive performance.
Among the five machine learning models, the support vector machine algorithm had the best predictive performance. Its area under the receiver operating characteristic (ROC) curve (AUC) was 0.991, and its accuracy, precision, specificity and recall were 95.6%, 95.5%, 95.4% and 95.9%, respectively. According to the SHAP plot, men with low platelet distribution width (PDW) and low basophil relative value (BASO%), together with high red blood cell distribution width coefficient of variation (R-CV), mean corpuscular hemoglobin concentration (MCHC) and mean platelet volume (MPV), were predicted to be at high risk of brucellosis. PDW contributed the most to the prediction model, followed by R-CV.
In conclusion, a high-precision early prediction method for brucellosis based on machine learning may be of great significance for the early detection and treatment of brucellosis patients.
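The four threshold-based metrics named in the abstract (accuracy, precision, specificity and recall) have standard definitions in terms of a binary confusion matrix. The sketch below shows those definitions, treating brucellosis cases as the positive class. It is an illustration only: the abstract does not describe how the authors computed the metrics, the names `ConfusionCounts` and `binary_metrics` are hypothetical, and the counts are made up rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; 'positive' means a brucellosis case."""
    tp: int  # cases predicted as cases
    fn: int  # cases predicted as controls
    fp: int  # controls predicted as cases
    tn: int  # controls predicted as controls


def binary_metrics(c: ConfusionCounts) -> dict:
    """Standard definitions of the four metrics listed in the abstract."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,    # share of all predictions that are correct
        "precision": c.tp / (c.tp + c.fp),    # share of predicted cases that are true cases
        "specificity": c.tn / (c.tn + c.fp),  # share of controls correctly identified
        "recall": c.tp / (c.tp + c.fn),       # share of cases correctly identified
    }


# Hypothetical counts for illustration only; NOT the study's results.
example = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
metrics = binary_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The function assumes every denominator is non-zero, meaning at least one case, one control and one positive prediction. A production version would need to handle those edge cases explicitly.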