Development of a 30-day mortality risk prediction model for elderly hemophagocytic lymphohistiocytosis using machine learning based on peripheral blood indicators

Jun ZHOU; Mingjun XIE; Yaman WANG; Huaguo XU

VernacularTitle:机器学习建立基于外周血指标的老年噬血细胞综合征30天死亡风险模型
Author: Jun ZHOU ¹ ; Mingjun XIE ¹ ; Yaman WANG ¹ ; Huaguo XU ¹
Author Information

1. Department of Laboratory Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
Publication Type: Journal Article
Keywords: Hemophagocytic lymphohistiocytosis; Machine learning; Urea; Ferritin; Elderly; Prediction model
From: Chinese Journal of Laboratory Medicine 2025;48(12):1521-1527
Country: China
Language: Chinese
Abstract:
Objective: To develop a machine learning prediction model based on peripheral blood indicators for assessing 30-day mortality risk in elderly patients diagnosed with hemophagocytic lymphohistiocytosis (HLH).
Methods: A retrospective cohort study enrolled elderly patients (age ≥65 years) diagnosed with HLH at the First Affiliated Hospital of Nanjing Medical University between January 1, 2015, and November 30, 2023. Demographic characteristics, clinical manifestations, and laboratory parameters at admission were collected. The study included 204 elderly HLH patients with a median age of 70 (68-75) years, comprising 134 males (65.69%) and 70 females (34.31%). Using computer-generated random numbers, the patients were randomly divided into training and validation cohorts at a 7:3 ratio. Based on 30-day survival outcomes, patients in the training cohort were categorized into death and survivor groups. Predictive variables were screened by univariate analysis followed by the Boruta algorithm, and prediction models were constructed using 11 machine learning algorithms. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1-score, calibration curves, and decision curve analysis. SHAP analysis was used for model interpretation.
Results: Univariate comparison of the death and survivor groups in the training cohort identified 25 significant indicators (P<0.05). Boruta algorithm-based screening further identified nine predictive variables: urea, ferritin, creatinine (CREA), D-dimer (D-D), platelet count (PLT), activated partial thromboplastin time (APTT), aspartate aminotransferase (AST), creatine kinase (CK), and alanine aminotransferase (ALT). Among the 11 algorithms, the top five models by AUC in the training cohort were XGBoost (AUC=1.000), AdaBoost (AUC=1.000), GBDT (AUC=1.000), DT (AUC=0.967), and RF (AUC=0.945). In the validation cohort, the top five models by AUC were RF (AUC=0.812), LR (AUC=0.792), LightGBM (AUC=0.769), AdaBoost (AUC=0.746), and GBDT (AUC=0.742). The RF model therefore showed the best performance in the validation cohort. SHAP analysis indicated that urea contributed most to the predictions.
Conclusion: A machine learning model based on routine laboratory indicators can accurately predict the 30-day mortality risk in elderly HLH patients.
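The evaluation steps named in the Methods (a random 7:3 split, AUC, and the threshold-based metrics) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `split_7_3`, `auc`, and `threshold_metrics`, the seed, the 0.5 threshold, and the toy labels and scores are all hypothetical and do not come from the study.

```python
import random


def split_7_3(n_patients, seed=0):
    """Randomly split patient indices into training and validation sets at 7:3.

    The seed and the rounding rule are assumptions of this sketch.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    cut = round(n_patients * 0.7)
    return indices[:cut], indices[cut:]


def auc(y_true, y_score):
    """AUC as the fraction of (death, survivor) pairs ranked correctly.

    Ties between a death and a survivor count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_score, threshold=0.5):
    """Confusion-matrix metrics at a fixed (assumed) probability threshold."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "ppv": ppv,
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * ppv * sensitivity, ppv + sensitivity),
    }


# Toy data, invented for illustration: 1 = died within 30 days, 0 = survived.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

train_idx, valid_idx = split_7_3(204)
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

With 204 patients this rounding rule gives 143 training and 61 validation indices; the abstract does not report the actual cohort sizes, so these counts are a property of the sketch only. Calibration curves, decision curve analysis, Boruta screening, and SHAP are not covered here.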