Search Results

1.Exploration of basket trial design with Bayesian method and its application value in traditional Chinese medicine.

Si-Cun WANG ; Mu-Zhi LI ; Hai-Xia DANG ; Hao GU ; Jun LIU ; Zhong WANG ; Ya-Nan YU

China Journal of Chinese Materia Medica 2025;50(3):846-852

The basket trial, an innovative clinical trial design concept, marks the shift in medical research from traditional large-scale, single-disease treatment to precise and individualized treatment. The gradual incorporation of Bayesian methods has made the trial design more scientific, reasonable, and efficient. The fundamental principle of the Bayesian method is to combine prior knowledge with new observational data to dynamically update the posterior probability. This flexibility enhances the basket trial's capacity to adapt to variations during the research process. Consequently, it enables researchers to dynamically adjust research strategies based on accumulated data and improve the predictive accuracy regarding treatment responses. In addition, the design concept of the basket trial aligns with the traditional Chinese medicine (TCM) principle of "homotherapy for heteropathy", which emphasizes that under certain conditions, different diseases may have the same treatment. Similarly, basket trials allow using a uniform trial design across multiple diseases, offering enhanced operability and significant practical value in the realm of TCM, particularly within the context of syndrome-based disease research. By introducing basket trials, the design of TCM clinical studies will be more scientific and yield higher-quality evidence. This study systematically categorized various Bayesian methods and models utilized in basket trials, evaluated their strengths and weaknesses, and identified their appropriate application contexts, so as to offer a practical guide for designing basket trials in the realm of TCM.
Bayes Theorem ; Humans ; Medicine, Chinese Traditional/methods* ; Research Design ; Clinical Trials as Topic/methods* ; Drugs, Chinese Herbal/therapeutic use*
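
The prior-to-posterior updating described in this abstract can be illustrated with a conjugate Beta-Binomial model for the response rate in a single basket. This is a minimal sketch and not a method taken from the article: the `BetaPosterior` class, the uniform Beta(1, 1) prior, and the responder counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta(alpha, beta) belief about a basket's response rate (hypothetical helper)."""

    alpha: float
    beta: float

    def update(self, responders: int, non_responders: int) -> "BetaPosterior":
        # Conjugate update: observed counts are added to the prior pseudo-counts.
        return BetaPosterior(self.alpha + responders, self.beta + non_responders)

    @property
    def mean(self) -> float:
        # Posterior mean of the response rate.
        return self.alpha / (self.alpha + self.beta)


# Assumed uniform prior, then two batches of made-up interim data.
prior = BetaPosterior(1.0, 1.0)
interim = prior.update(responders=3, non_responders=7)   # Beta(4, 8)
final = interim.update(responders=5, non_responders=5)   # Beta(9, 13)
print(round(interim.mean, 3), round(final.mean, 3))
```

Each interim look reuses the previous posterior as the new prior, which is the sense in which the belief is updated dynamically as data accumulate.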

2.Construction and external validation of a machine learning-based prediction model for epilepsy one year after acute stroke.

Wenkao ZHOU ; Fangli ZHAO ; Xingqiang QIU ; Yujuan YANG ; Tingting WANG ; Lingyan HUANG

Chinese Critical Care Medicine 2025;37(5):445-451

OBJECTIVE: To identify the optimal machine learning algorithm for predicting post-stroke epilepsy (PSE) within one year following acute stroke, establish a nomogram model based on this algorithm, and perform external validation to achieve accurate prediction of secondary epilepsy. METHODS: A total of 870 acute stroke patients admitted to the emergency department of Xiang'an Hospital of Xiamen University from June 2019 to June 2023 were enrolled for model development (model group). An external validation cohort of 435 acute stroke patients admitted to the Fifth Hospital of Xiamen during the same period was used to validate the machine learning algorithms and nomogram model. Patients were classified into control and epilepsy groups based on the development of PSE within one year. Clinical and laboratory data, including baseline characteristics, stroke location, vascular status, complications, hematologic parameters, and National Institutes of Health Stroke Scale (NIHSS) score, were collected for analysis. Nine machine learning algorithms such as logistic regression, CN2 rule induction, K-nearest neighbors, adaptive boosting, random forest, gradient boosting, support vector machine, naive Bayes, and neural network were applied to evaluate predictive performance. The area under the curve (AUC) of receiver operator characteristic curve (ROC curve) was used to identify the optimal algorithm. Logistic regression was used to screen risk factors for PSE, and the top 10 predictors were selected to construct the nomogram model. The predictive performance of the model was evaluated using the ROC curve in both the model and validation groups. RESULTS: Among the 870 patients in the model group, 29 developed PSE within one year. Among the nine algorithms tested, logistic regression demonstrated the best performance and generalizability, with an AUC of 0.923. 
Univariate logistic regression identified several risk factors for PSE, including platelet count, white blood cell count, red blood cell count, glycated hemoglobin (HbA1c), C-reactive protein (CRP), triglycerides, high-density lipoprotein (HDL), aspartate aminotransferase (AST), alanine aminotransferase (ALT), activated partial thromboplastin time (APTT), thrombin time, D-dimer, fibrinogen, creatine kinase (CK), creatine kinase-MB (CK-MB), lactate dehydrogenase (LDH), serum sodium, lactic acid, anion gap, NIHSS score, brain herniation, periventricular stroke, and carotid artery plaque. Further multivariate logistic regression analysis showed that white blood cell count, HDL, fibrinogen, lactic acid, and brain herniation were independent risk factors [odds ratios (OR) were 1.837, 198.039, 47.025, 11.559, and 70.722, respectively, all P < 0.05]. In the external validation group, univariate logistic regression analysis showed that platelet count, white blood cell count, CRP, triacylglycerol, APTT, D-dimer, fibrinogen, CK, CK-MB, LDH, NIHSS score, and cerebral herniation were risk factors for PSE one year after acute stroke. Further multiple logistic regression analysis showed that APTT and cerebral herniation were independent predictors (ORs were 0.587 and 116.193, respectively, both P < 0.05). The nomogram model, constructed using 10 key variables (brain herniation, periventricular stroke, carotid artery plaque, white blood cell count, triglycerides, thrombin time, D-dimer, serum sodium, lactic acid, and NIHSS score), achieved an AUC of 0.908 in the model group and 0.864 in the external validation group. CONCLUSIONS: The logistic regression-based prediction model for epilepsy one year after acute stroke, developed using machine learning algorithms, showed optimal predictive performance.
The nomogram model based on the logistic regression-derived predictors showed strong discriminative power and was successfully validated externally, suggesting favorable clinical applicability and generalizability.
Humans ; Machine Learning ; Stroke/complications* ; Nomograms ; Epilepsy/etiology* ; Algorithms ; Male ; Female ; Logistic Models ; Middle Aged ; Aged ; Risk Factors ; Bayes Theorem
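
The AUC values reported in this abstract summarize discrimination: the AUC equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up predicted risks: three cases (label 1) and three controls (label 0).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 8 of the 9 pairs are ordered correctly
print(round(toy_auc, 3))
```

The pairwise loop is quadratic in sample size; it is used here only because it makes the definition explicit.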

3.Establishment and evaluation of a machine learning prediction model for sepsis-related encephalopathy in the elderly.

Xiao YUE ; Yiwen WANG ; Zhifang LI ; Lei WANG ; Li HUANG ; Shuo WANG ; Yiming HOU ; Shu ZHANG ; Zhengbin WANG

Chinese Critical Care Medicine 2025;37(10):937-943

OBJECTIVE: To construct a machine learning prediction model for sepsis-associated encephalopathy (SAE), and analyze the application value of the model in early identification of SAE risk in elderly septic patients. METHODS: Patients aged over 60 years with a primary diagnosis of sepsis admitted to the intensive care unit (ICU) from 2008 to 2023 were selected from Medical Information Mart for Intensive Care-IV 2.2 (MIMIC-IV 2.2). Demographic variables, disease severity scores, comorbidities, interventions, laboratory indicators, and hospitalization details were collected. Key factors associated with SAE were identified using univariate Logistic regression analysis. The data were randomly divided into training and validation sets in a 7 : 3 ratio. Multivariable Logistic regression analysis was conducted in the training set and visualized using a nomogram model for prediction of SAE. The discrimination of the model was evaluated in the validation set using the receiver operator characteristic curve (ROC curve), and its calibration was assessed using a calibration curve. Furthermore, multiple machine learning algorithms, including multi-layer perceptron (MLP), support vector machine (SVM), naive Bayes (NB), gradient boosting machine (GBM), random forest (RF), and extreme gradient boosting (XGB), were constructed in the training set. Their predictive performance was subsequently evaluated on the validation set. Taking the XGB model as an example, the SHapley Additive exPlanations (SHAP) algorithm was used to enhance the interpretability of the model and to identify the key predictive factors and their contributions. RESULTS: A total of 2 204 septic patients were finally enrolled, of whom 840 developed SAE (38.1%). A total of 21 variables associated with SAE were screened through univariate Logistic regression analysis.
Multivariable Logistic regression analysis showed that endotracheal intubation [odds ratio (OR) = 0.40, 95% confidence interval (95%CI) was 0.19-0.88, P < 0.001], oxygen therapy (OR = 0.76, 95%CI was 0.53-0.95, P = 0.023), tracheotomy (OR = 0.20, 95%CI was 0.07-0.53, P < 0.001), continuous renal replacement therapy (CRRT; OR = 0.32, 95%CI was 0.15-0.70, P < 0.001), cerebrovascular disease (OR = 0.31, 95%CI was 0.16-0.60, P < 0.001), rheumatic disease (OR = 0.44, 95%CI was 0.19-0.99, P < 0.001), male (OR = 0.68, 95%CI was 0.54-0.86, P = 0.001), and maximum anion gap (AG; OR = 0.95, 95%CI was 0.93-0.97, P < 0.001) were associated with a decreased probability of SAE, and age (OR = 1.05, 95%CI was 1.03-1.06, P < 0.001), acute physiology score III (APSIII; OR = 1.02, 95%CI was 1.01-1.02, P < 0.001), Oxford acute severity of illness score (OASIS; OR = 1.04, 95%CI was 1.03-1.06, P < 0.001), and length of hospital stay (OR = 1.01, 95%CI was 1.01-1.02, P < 0.001) were associated with an increased probability of SAE. A nomogram model was constructed based on these variables. In the validation set, ROC curve analysis showed that the model achieved an area under the ROC curve (AUC) of 0.723, and the calibration curve showed good consistency between the predicted probability of the model and the observed probability. Among the machine learning algorithms, including MLP, SVM, NB, GBM, RF, and XGB, the SVM model and RF model demonstrated relatively good predictive performance, with AUCs of 0.748 and 0.739, respectively, and sensitivity exceeding 85% for both. The predictive performance of the XGB model was explained through SHAP analysis, and the results indicated that APSIII score (SHAP value was 0.871), age (SHAP value was 0.521), and OASIS score (SHAP value was 0.443) were important factors affecting the predictive performance of the model.
CONCLUSIONS: The machine learning-based SAE prediction model exhibits good predictive capability and holds significant application value for the early identification of SAE risk in elderly septic patients.
Humans ; Machine Learning ; Aged ; Sepsis-Associated Encephalopathy ; Sepsis/complications* ; Intensive Care Units ; Logistic Models ; Middle Aged ; Male ; ROC Curve ; Female ; Bayes Theorem ; Nomograms ; Support Vector Machine ; Algorithms

4.An adaptive Bayesian randomized controlled trial of traditional Chinese medicine in progressive pulmonary fibrosis: Rationale and study design.

Cheng ZHANG ; Yi-Sen NIE ; Chuan-Tao ZHANG ; Hong-Jing YANG ; Hao-Ran ZHANG ; Wei XIAO ; Guang-Fu CUI ; Jia LI ; Shuang-Jing LI ; Qing-Song HUANG ; Shi-Yan YAN

Journal of Integrative Medicine 2025;23(2):138-144

Progressive pulmonary fibrosis (PPF) is a progressive and lethal condition with few effective treatment options. Improvements in quality of life for patients with PPF remain limited even while receiving treatment with approved antifibrotic drugs. Traditional Chinese medicine (TCM) has the potential to improve cough, dyspnea and fatigue symptoms of patients with PPF. TCM treatments are typically diverse and individualized, requiring urgent development of efficient and precise design strategies to identify effective treatment options. We designed an innovative Bayesian adaptive two-stage trial, hoping to provide new ideas for the rapid evaluation of the effectiveness of TCM in PPF. An open-label, two-stage, adaptive Bayesian randomized controlled trial will be conducted in China. Based on Bayesian methods, the trial will employ response-adaptive randomization to allocate patients to study groups based on data collected over the course of the trial. The adaptive Bayesian trial design will employ a Bayesian hierarchical model with "stopping" and "continuation" criteria once a predetermined posterior probability of superiority or futility and a decision threshold are reached. The trial can be implemented more efficiently by sharing the master protocol and organizational management mechanisms of the sub-trial we have implemented. The primary patient-reported outcome is a change in the Leicester Cough Questionnaire score, reflecting an improvement in cough-specific quality of life. The adaptive Bayesian trial design may be a promising method to facilitate the rapid clinical evaluation of TCM effectiveness for PPF, and will provide an example for how to evaluate TCM effectiveness in rare and refractory diseases. However, due to the complexity of the trial implementation, sufficient simulation analysis by professional statistical analysts is required to construct a Bayesian response-adaptive randomization procedure for timely response. 
Moreover, detailed standard operating procedures need to be developed to ensure the feasibility of the trial implementation. Please cite this article as: Zhang C, Nie YS, Zhang CT, Yang HJ, Zhang HR, Xiao W, Cui GF, Li J, Li SJ, Huang QS, Yan SY. An adaptive Bayesian randomized controlled trial of traditional Chinese medicine in progressive pulmonary fibrosis: Rationale and study design. J Integr Med. 2025; 23(2): 138-145.
Female ; Humans ; Male ; Bayes Theorem ; Disease Progression ; Drugs, Chinese Herbal/therapeutic use* ; Medicine, Chinese Traditional/methods* ; Pulmonary Fibrosis/therapy* ; Quality of Life ; Randomized Controlled Trials as Topic ; Research Design ; Adaptive Clinical Trials as Topic
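
The "stopping" and "continuation" logic described in this abstract can be sketched with a deliberately simplified stand-in. The abstract specifies a Bayesian hierarchical model and a questionnaire-score outcome. The sketch instead assumes a binary responder outcome, independent Beta(1, 1) priors for two arms, and hypothetical decision thresholds of 0.975 (superiority) and 0.05 (futility). None of these values come from the article.

```python
import random


def prob_treatment_superior(resp_t: int, n_t: int, resp_c: int, n_c: int,
                            draws: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(rate_treatment > rate_control | data)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        # Draw each arm's response rate from its Beta posterior (uniform prior assumed).
        p_t = rng.betavariate(1 + resp_t, 1 + n_t - resp_t)
        p_c = rng.betavariate(1 + resp_c, 1 + n_c - resp_c)
        if p_t > p_c:
            wins += 1
    return wins / draws


def interim_decision(prob_superior: float,
                     superiority: float = 0.975,
                     futility: float = 0.05) -> str:
    """Map a posterior probability of superiority to an interim action (assumed thresholds)."""
    if prob_superior >= superiority:
        return "stop for superiority"
    if prob_superior <= futility:
        return "stop for futility"
    return "continue"


# Made-up interim data: 18/20 responders on treatment versus 5/20 on control.
strong = interim_decision(prob_treatment_superior(18, 20, 5, 20))
# Made-up interim data with identical arms: 5/10 versus 5/10.
unclear = interim_decision(prob_treatment_superior(5, 10, 5, 10))
print(strong, "|", unclear)
```

In a real response-adaptive design the same posterior probabilities would also drive allocation ratios, and the thresholds would be calibrated by simulation, as the abstract itself notes.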

5.Identifying High-Risk Areas for Type 2 Diabetes Mellitus Mortality in Guangdong, China: Spatiotemporal Clustering and Socioenvironmental Determinants.

Hai Ming LUO ; Wen Biao HU ; Yan Jun XU ; Xue Yan ZHENG ; Qun HE ; Lu LYU ; Rui Lin MENG ; Xiao Jun XU ; Fei ZOU

Biomedical and Environmental Sciences 2025;38(5):585-597

OBJECTIVE: This study aimed to identify high-risk areas for type 2 diabetes mellitus (T2DM) mortality to provide relevant evidence for interventions in emerging economies. METHODS: Empirical Bayesian Kriging and a discrete Poisson space-time scan statistic were applied to identify the spatiotemporal clusters of T2DM mortality. The relationships between economic factors, air pollutants, and the mortality risk of T2DM were assessed using regression analysis and the Poisson Log-linear Model. RESULTS: A coastal district in East Guangdong, China, had the highest risk (Relative Risk [RR] = 4.58, P < 0.01), followed by the 10 coastal districts/counties in West Guangdong, China (RR = 2.88, P < 0.01). The coastal county in the Pearl River Delta, China (RR = 2.24, P < 0.01), had the third-highest risk. The remaining risk areas were two coastal counties in East Guangdong, 16 districts/counties in the Pearl River Delta, and two counties in North Guangdong, China. Mortality due to T2DM was associated with gross domestic product per capita (GDP per capita). In pilot assessments, T2DM mortality was significantly associated with carbon monoxide. CONCLUSION: High mortality from T2DM occurred in the coastal areas of East and West Guangdong, especially where the economy was progressing towards the upper middle-income level.
Diabetes Mellitus, Type 2/epidemiology* ; China/epidemiology* ; Humans ; Risk Factors ; Spatio-Temporal Analysis ; Air Pollutants/analysis* ; Socioeconomic Factors ; Bayes Theorem ; Female ; Male ; Middle Aged

6.Spatio-Temporal Pattern and Socio-economic Influencing Factors of Tuberculosis Incidence in Guangdong Province: A Bayesian Spatiotemporal Analysis.

Hui Zhong WU ; Xing LI ; Jia Wen WANG ; Rong Hua JIAN ; Jian Xiong HU ; Yi Jun HU ; Yi Ting XU ; Jianpeng XIAO ; Ai Qiong JIN ; Liang CHEN

Biomedical and Environmental Sciences 2025;38(7):819-828

OBJECTIVE: To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis (TB) in Guangdong Province between 2010 and 2019. METHOD: Spatial and temporal variations in TB incidence were mapped using heat maps and hierarchical clustering. Socioenvironmental influencing factors were evaluated using a Bayesian spatiotemporal conditional autoregressive (ST-CAR) model. RESULTS: Annual incidence of TB in Guangdong decreased from 91.85/100,000 in 2010 to 53.06/100,000 in 2019. Spatial hotspots were found in northeastern Guangdong, particularly in Heyuan, Shanwei, and Shantou, while Shenzhen, Dongguan, and Foshan had the lowest rates in the Pearl River Delta. The ST-CAR model showed that the TB risk was lower with higher per capita Gross Domestic Product (GDP) [Relative Risk (RR), 0.91; 95% Confidence Interval (CI): 0.86-0.98], a higher ratio of licensed physicians and physician (RR, 0.94; 95% CI: 0.90-0.98), and higher per capita public expenditure (RR, 0.94; 95% CI: 0.90-0.97), with a marginal effect of population density (RR, 0.86; 95% CI: 0.86-1.00). CONCLUSION: The incidence of TB in Guangdong varies spatially and temporally. Areas with poor economic conditions and insufficient healthcare resources are at an increased risk of TB infection. Strategies focusing on equitable health resource distribution and economic development are the key to TB control.
Humans ; China/epidemiology* ; Incidence ; Bayes Theorem ; Spatio-Temporal Analysis ; Tuberculosis/epidemiology* ; Socioeconomic Factors

7.Disease burden and trend of melanoma among middle-aged and elderly population in China from 1990 to 2020, and prediction for 2022 to 2035.

Lyuxin GUAN ; Ziqin GAN ; Guangtao HUANG ; Suchun HOU ; Yansi LYU

Journal of Zhejiang University. Medical sciences 2025;54(1):1-9

OBJECTIVES: To analyze the disease burden of melanoma among middle-aged and elderly populations in China, and to predict the future trend. METHODS: Data from the Global Burden of Disease (GBD) 2021 were utilized to collect incidence and mortality rates of melanoma, disability-adjusted life years (DALYs), and corresponding age crude rates among the middle-aged and elderly population in China from 1990 to 2021. Additionally, the estimated annual percentage change (EAPC) was employed to assess the temporal trends. Age-period-cohort (APC) and Bayesian age-period-cohort (BAPC) models were utilized to compute age, period, and cohort effects on incidence and mortality rates of melanoma, as well as to predict future trends up to 2035. RESULTS: During 1990-2021, the incidence rate of melanoma for males was higher than that for females among the middle-aged and elderly population in China, and the overall incidence rate increased annually with an EAPC of 2.13 (1.90 to 2.36), while the overall mortality rate and DALY rate showed a declining trend with an EAPC of -0.28 (-0.41 to -0.15) and -0.54 (-0.68 to -0.41), respectively. The results of the APC model analysis revealed that age effects on both incidence and mortality rates of melanoma in China's middle-aged and elderly population were significant, with both increasing with age. Period and cohort effects showed an upward trend for incidence rates but a downward trend for mortality rates. Moreover, the period and cohort effects for mortality rates were not significant among females. In the BAPC prediction model, the number of incident melanoma cases in middle-aged and elderly people in China is projected to increase dramatically. By 2035, the number of incident cases is expected to reach approximately 9600 (males) and 10 300 (females), corresponding to an incidence rate of 2.66/10⁵ and 2.67/10⁵, respectively.
The number of deaths is projected to be about 2600 (males) and 3500 (females) by 2035, corresponding to a mortality rate of 0.72/10⁵ and 0.91/10⁵, respectively. CONCLUSIONS: The disease burden of melanoma among the middle-aged and elderly population in China remains substantial and is expected to increase over the next decade.
Humans ; Melanoma/mortality* ; China/epidemiology* ; Aged ; Middle Aged ; Male ; Female ; Incidence ; Disability-Adjusted Life Years ; Bayes Theorem ; Cost of Illness ; Skin Neoplasms/epidemiology*
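
The abstract reports EAPC values without stating how they were computed. Under the commonly used definition, the natural log of the rate is regressed on calendar year and the slope b is converted with EAPC = 100 × (exp(b) − 1). The sketch below assumes that definition. The function name `eapc` and the synthetic series are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def eapc(years: Sequence[float], rates: Sequence[float]) -> float:
    """Estimated annual percentage change from a log-linear least-squares fit."""
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two (year, rate) pairs of equal length")
    log_rates = [math.log(r) for r in rates]  # rates must be strictly positive
    mean_x = sum(years) / len(years)
    mean_y = sum(log_rates) / len(log_rates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, log_rates))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx  # b in ln(rate) = a + b * year
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic series that grows by exactly 2% per year, so the EAPC should be 2.
toy_years = list(range(2000, 2010))
toy_rates = [1.5 * 1.02 ** i for i in range(10)]
toy_eapc = eapc(toy_years, toy_rates)
print(round(toy_eapc, 4))
```

A confidence interval for the EAPC, like the intervals quoted in the abstract, would come from the standard error of the slope, which this sketch does not compute.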

8.Failure Diagnosis Analysis of Medical Equipment Based on Fault Tree and Fuzzy Bayesian Network.

Ke ZHANG ; Liang HUANG

Chinese Journal of Medical Instrumentation 2025;49(5):540-544

OBJECTIVE: To enhance the reliability of medical equipment, this study aims to develop a failure cause diagnosis model and provide rational suggestions for efficient equipment use. METHODS: Fault tree analysis (FTA) was used to identify basic events causing equipment failure and to calculate their prior probabilities. Conditional probability tables for each node were obtained through expert assessment. Triangular fuzzy number theory was integrated with a Bayesian network (BN) to construct a fuzzy Bayesian network (FBN) for posterior probability inference and sensitivity analysis. RESULTS: Using endoscopes as the subject, the analysis shows that the model accurately calculates the endoscope failure probability at 0.385%, and identifies the key causes: improper cleaning (X5, posterior probability 0.36064), untimely fault detection (X8, posterior probability 0.23571), irregular transportation (X6, posterior probability 0.11344), and natural aging (X10, posterior probability 0.11377). Sensitivity analysis also confirms their influence weights (mutual information values are 0.00749, 0.00591, 0.00202, 0.00174). CONCLUSION: The model can accurately perform quantitative analysis and rapid fault location of medical equipment failures, enabling effective preventive measures.
Bayes Theorem ; Fuzzy Logic ; Equipment Failure Analysis/methods* ; Equipment Failure ; Algorithms
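
The posterior inference step in this abstract can be illustrated with a much simpler model than the one the authors describe. The sketch assumes independent basic events feeding a single deterministic OR gate, whereas the study uses expert-assessed conditional probability tables. It also assumes centroid defuzzification of triangular fuzzy numbers. The event names and fuzzy priors are hypothetical and are not the study's values.

```python
import math
from typing import Dict, Tuple

TriangularFuzzy = Tuple[float, float, float]  # (lower, modal, upper)


def centroid(fuzzy: TriangularFuzzy) -> float:
    """Crisp value of a triangular fuzzy number via its centroid."""
    lower, modal, upper = fuzzy
    return (lower + modal + upper) / 3.0


def or_gate_posteriors(fuzzy_priors: Dict[str, TriangularFuzzy]) -> Tuple[float, Dict[str, float]]:
    """Return P(failure) and P(event | failure) for independent events under an OR gate."""
    priors = {name: centroid(f) for name, f in fuzzy_priors.items()}
    # The top event occurs unless every basic event is absent.
    p_failure = 1.0 - math.prod(1.0 - p for p in priors.values())
    # Bayes' rule with P(failure | event) = 1 under a deterministic OR gate.
    posteriors = {name: p / p_failure for name, p in priors.items()}
    return p_failure, posteriors


# Hypothetical fuzzy priors for two illustrative basic events.
toy_priors = {
    "improper_cleaning": (0.001, 0.002, 0.003),
    "late_fault_detection": (0.0005, 0.001, 0.0015),
}
p_top, toy_posteriors = or_gate_posteriors(toy_priors)
print(round(p_top, 6), {k: round(v, 3) for k, v in toy_posteriors.items()})
```

Ranking basic events by posterior probability given failure is what lets such a model point to the most likely cause once a fault has been observed.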

9.Analysis and projection of the disease burden of nasopharyngeal carcinoma in China based on the GBD database.

Yexun SONG ; Xiajing LIU ; Yongquan ZHANG ; Heqing LI

Journal of Central South University(Medical Sciences) 2025;50(4):675-683

OBJECTIVES: Nasopharyngeal carcinoma is often diagnosed at a late stage due to its concealed location and exhibits marked regional clustering, posing a significant public health challenge in China. This study aims to analyze the disease burden of nasopharyngeal carcinoma in China using the latest 2021 Global Burden of Diseases (GBD) database, providing epidemiological evidence for precise prevention and control of nasopharyngeal carcinoma. METHODS: Age-standardized incidence rate (ASIR), mortality rate, and disability-adjusted life year (DALY) rate were used as indicators of disease burden. Stratified analyses were conducted by age, sex, socio-demographic index (SDI), and relevant risk factors. The autoregressive integrated moving average (ARIMA) model and Bayesian age-period-cohort (BAPC) model were employed to project ASIR trends through 2050. RESULTS: In 2021, China's age-standardized incidence, mortality, and DALY rates of nasopharyngeal carcinoma were 3.4/100 000, 1.5/100 000, and 48.7/100 000, respectively, all higher than the global average. Across all age groups, Chinese males exhibited higher ASIR, mortality, and DALY rates than females. From 1990 to 2021, the disease burden of nasopharyngeal carcinoma in China decreased gradually with rising SDI. The proportion of nasopharyngeal carcinoma burden attributed to alcohol consumption, smoking, and occupational formaldehyde exposure in China exceeded global levels, especially among males. Projections from both models indicate a rising trend in ASIR for males, females, and the general population in China and globally from 2022 to 2050. CONCLUSIONS: Over the past 30 years, the disease burden of nasopharyngeal carcinoma in China has decreased with the increasing SDI values but remains higher than the global average. Furthermore, ASIR is projected to increase over the next 30 years.
It is imperative for China to enhance healthcare resource allocation for nasopharyngeal carcinoma prevention, diagnosis, and treatment, particularly among high-risk male populations.
Humans ; China/epidemiology* ; Male ; Nasopharyngeal Carcinoma/mortality* ; Female ; Middle Aged ; Nasopharyngeal Neoplasms/mortality* ; Adult ; Incidence ; Global Burden of Disease ; Disability-Adjusted Life Years ; Aged ; Risk Factors ; Adolescent ; Databases, Factual ; Young Adult ; Cost of Illness ; Child ; Bayes Theorem

10.Nomogram and machine learning models for predicting in-hospital mortality in sepsis patients with deep vein thrombosis.

Hongwei DUAN ; Huaizheng LIU ; Chuanzheng SUN ; Jing QI

Journal of Central South University(Medical Sciences) 2025;50(6):1013-1029

OBJECTIVES: Global epidemiological data indicate that 20% to 30% of intensive care unit (ICU) sepsis patients progress to deep vein thrombosis (DVT) due to coagulopathy, with an associated mortality rate of 25% to 40%. Existing prognostic tools have limitations. This study aims to develop and validate nomogram and machine learning models to predict in-hospital mortality in sepsis patients with DVT and assess their clinical applicability. METHODS: This multicenter retrospective study drew on data from the Medical Information Mart for Intensive Care IV (MIMIC-IV; n=2 235), the eICU Collaborative Research Database (eICU-CRD; n=1 274), and the Patient Admission Dataset from the ICU of Third Xiangya Hospital, Central South University (CSU-XYS-ICU; n=107). MIMIC-IV was split into a training set (n=1 584) and internal validation set (n=651), with the remaining datasets used for external validation. Predictors were selected via least absolute shrinkage and selection operator (LASSO) regression and Bayesian Information Criterion (BIC), and a nomogram model was constructed. An extreme gradient boosting (XGBoost) algorithm was used to build the machine learning model. Model performance was assessed by the concordance index (C-index), calibration curves, Brier score, decision curve analysis (DCA), and net reclassification improvement index (NRI). RESULTS: Five key predictors, age [odds ratio (OR)=1.02, 95% CI 1.01 to 1.03, P<0.001], minimum activated partial thromboplastin time (APTT; OR=1.09, 95% CI 1.08 to 1.11, P<0.001), maximum APTT (OR=1.01, 95% CI 1.00 to 1.01, P<0.001), maximum lactate (OR=1.56, 95% CI 1.39 to 1.75, P<0.001), and maximum serum creatinine (OR=2.03, 95% CI 1.79 to 2.30, P<0.001), were included in the nomogram. The model showed robust performance in internal validation (C-index=0.845, 95% CI 0.811 to 0.879) and external validation (eICU-CRD: C-index=0.827, 95% CI 0.800 to 0.854; CSU-XYS-ICU: C-index=0.779, 95% CI 0.687 to 0.871).
Calibration curves indicated good agreement between predicted and observed outcomes (Brier score<0.25), and DCA confirmed clinical benefit. The XGBoost model achieved an area under the receiver operating characteristic curve (AUC) of 0.982 (95% CI 0.969 to 0.985) in the training set, but performance declined in external validation (eICU-CRD, AUC=0.825, 95% CI 0.817 to 0.861; CSU-XYS-ICU, AUC=0.766, 95% CI 0.700 to 0.873), though it remained above clinical thresholds. Net reclassification improvement was slightly lower for XGBoost compared with the nomogram (NRI=0.58). CONCLUSIONS: Both the nomogram and XGBoost models effectively predict in-hospital mortality in sepsis patients with DVT. However, the nomogram offers superior generalizability and clinical usability. Its visual scoring system provides a quantitative tool for identifying high-risk patients and implementing individualized interventions.
Humans ; Sepsis/complications* ; Machine Learning ; Nomograms ; Venous Thrombosis/complications* ; Retrospective Studies ; Hospital Mortality ; Male ; Female ; Middle Aged ; Aged ; Intensive Care Units ; Prognosis ; Bayes Theorem
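
The "Brier score<0.25" criterion quoted in this abstract is easier to read with the definition in hand. The Brier score is the mean squared difference between predicted probability and the observed 0/1 outcome, so a model that always predicts 0.5 scores exactly 0.25. The sketch below shows this. The function name `brier_score` and the toy predictions are hypothetical and are not data from the study.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and binary outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(probabilities)


toy_outcomes = [0, 1, 0, 1]
# An uninformative model that always predicts 0.5 lands exactly on the 0.25 benchmark.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], toy_outcomes)
# A sharper, well-directed model scores lower (better).
sharper = brier_score([0.1, 0.9, 0.2, 0.8], toy_outcomes)
print(coin_flip, round(sharper, 3))
```

Because the benchmark of 0.25 assumes roughly balanced outcomes, a score below it is a weak criterion when the event rate is far from 50%.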