Human-AI collaboration for sepsis early-warning system in emergency triage

Jingyuan XIE; Zhimao LI; Jiandong GAO; Yecheng LIU; Huadong ZHU; Ji WU

VernacularTitle:急诊分诊脓毒症预警系统的人机协作研究
Author: Jingyuan XIE¹; Zhimao LI; Jiandong GAO; Yecheng LIU; Huadong ZHU; Ji WU
Author Information

1. Department of Electronic Engineering, Tsinghua University, Beijing 100084
Keywords: Sepsis; Machine learning; Emergency service; Human-AI collaboration
From: Chinese Journal of Emergency Medicine 2025;34(5):641-647
Country: China
Language: Chinese
Abstract: Objective: The research group previously developed an artificial intelligence algorithm that predicts, at the emergency department triage stage, whether a patient will develop sepsis within 24 hours. This study examined how doctors respond to algorithm-generated risk alerts and designed physician-algorithm collaboration strategies to further improve sepsis risk identification.
Methods: Forty emergency department sepsis cases were collected from the open medical database MIMIC-IV (Medical Information Mart for Intensive Care) for a collaboration test. The cases were selected according to their typicality and classified according to the model’s confidence in its prediction. A total of 165 emergency doctors from 58 hospitals in China, stratified by professional rank, participated in the study. Using the information provided by the algorithm, four collaboration modes were designed that differed in information volume and reading cost. During the test, the doctors rated the sepsis risk of each case and recorded their confidence both before and after the model presented its results and interpretive information according to the collaboration mode.
Results: Analysis of the 4,704 valid evaluations made by 147 doctors showed no significant difference between collaboration modes in doctors’ detection of sepsis risk. For cases with high model confidence, physicians’ diagnostic accuracy improved by 2.6%±0.6% (P=0.02) after they saw the algorithm’s output, with increased confidence in correct judgments. Conversely, for cases with low model confidence, diagnostic accuracy decreased by 2.6%±1.4% (P=0.06), accompanied by reduced confidence in correct judgments.
Conclusions: The collaboration effect is determined mostly by the model’s confidence in its prediction. Different collaboration modes make no significant difference, and doctors of different professional ranks are influenced consistently at the same level of model confidence. Suggestions for collaboration design are as follows. When the model has low confidence in its own assessment of a patient’s sepsis risk, it should not show its assessment directly. When the model has high confidence, its assessment can be offered to the doctors as a reference. When predicting sepsis at the triage stage in emergency departments, no extra interpretive information is needed.
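
The design suggestions can be read as a confidence-gated display rule. The Python sketch below is only an illustration of that rule, not the authors' implementation: the `TriagePrediction` container, the `alert_for_doctor` function, the 0-to-1 confidence scale, and the 0.8 cut-off are all hypothetical, since the abstract does not say how model confidence is measured or where the boundary between high and low confidence lies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriagePrediction:
    """Hypothetical container for one model output at triage."""
    risk: float        # predicted probability of sepsis within 24 hours, in [0, 1]
    confidence: float  # model's confidence in that prediction, in [0, 1] (assumed scale)


def alert_for_doctor(pred: TriagePrediction,
                     min_confidence: float = 0.8) -> Optional[str]:
    """Return the text shown to the doctor, or None to withhold the assessment.

    min_confidence is an assumed cut-off, not a value taken from the study.
    """
    if pred.confidence < min_confidence:
        # Low confidence: do not show the model's assessment at all.
        return None
    # High confidence: show the risk as a reference, with no extra
    # interpretive information attached.
    return f"Model-estimated sepsis risk within 24 h: {pred.risk:.0%}"
```

Under these assumptions, a prediction with confidence 0.9 would produce a one-line reference message, while one with confidence 0.5 would produce nothing and leave the doctor's own assessment undisturbed.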