Establishment and validation of prediction model for cirrhosis-related hepatic encephalopathy by machine learning algorithm
10.3760/cma.j.cn114452-20240628-00338
- VernacularTitle:利用机器学习算法建立并验证肝硬化相关肝性脑病预测模型
- Author:Shuting FU(1); Bing HE(1); Jiancheng XU(1)
- Author Information:1. Department of Laboratory Medicine, The First Hospital of Jilin University, Changchun 130021
- Publication Type:Journal Article
- Keywords:Artificial intelligence; Liver cirrhosis; Hepatic encephalopathy; Hemoglobins
- From:Chinese Journal of Laboratory Medicine 2025;48(1):93-102
- Country:China
- Language:Chinese
- Abstract:
Objective: To construct and validate a machine learning model for predicting cirrhosis-associated hepatic encephalopathy (HE) and to evaluate its predictive efficacy.
Methods: Clinical data of 4 537 patients with liver cirrhosis, recorded in the medical record system and laboratory information system of the First Hospital of Jilin University from January 2018 to December 2019, were collected and analyzed retrospectively. After the inclusion and exclusion criteria were applied, 474 patients were included in the study. Cohort 1 comprised patients seen from January to December 2018, 113 with cirrhosis without HE and 108 with cirrhosis complicated by HE, and was used for feature screening, model building, selection of the optimal algorithm, and internal validation of the HE risk prediction model. Cohort 2 comprised patients seen from January to December 2019, 133 with cirrhosis without HE and 120 with cirrhosis complicated by HE, and was used for external validation. Lasso regression was used to identify key predictive variables, and four algorithms, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), random forest (RF), and support vector machine (SVM), were used for model building and internal validation. The DeLong test was used to compare the predictive efficacy of the four models for HE, and the optimal algorithm was selected with reference to sensitivity and specificity. The area under the ROC curve, the calibration curve, and the decision curve were used to evaluate, respectively, the predictive efficacy of the model, the accuracy of its predicted probabilities, and its clinical utility.
Results: The 46 tests with <30% missing data in Cohort 1 were extracted as candidate variables for modeling. Lasso regression screening yielded seven characteristic variables: hemoglobin (Hb), total bile acid (TBA), cholinesterase, total bilirubin, creatinine, prothrombin activity, and platelet count. The model built with the LightGBM algorithm (the HE-Lab7 model) predicted HE with an area under the curve (AUC) of 0.880, higher than that of XGBoost, RF, and SVM (all P<0.05), with a sensitivity of 0.825 and a specificity of 0.836. The Brier score of the calibration curve was 0.147, indicating good agreement between the predicted probabilities and the observed occurrence of HE. The decision curve indicated that the model has a high clinical benefit. In Cohort 2, the HE-Lab7 model predicted HE with an AUC of 0.775, a sensitivity of 0.927, and a specificity of 0.758.
Conclusion: Among the four machine learning algorithms compared on large-scale laboratory test data, LightGBM was optimal, and the cirrhosis-associated HE risk prediction model built with it showed good predictive efficacy, providing a reference for the early prediction and identification of cirrhosis-associated HE.
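
The evaluation metrics reported in the abstract (AUC, sensitivity, specificity, the Brier score, and the net benefit that underlies a decision curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the six toy cases are assumptions made purely for illustration, and neither the study data nor the HE-Lab7 model is reproduced here.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when p >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true: Sequence[int], y_prob: Sequence[float],
                            threshold: float = 0.5) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability: TP/n - FP/n * pt / (1 - pt).
    A decision curve plots this quantity over a range of thresholds."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: 1 = HE, 0 = no HE; probabilities are made up.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

sens, spec = sensitivity_specificity(y_true, y_prob, threshold=0.5)
print(f"AUC={auc(y_true, y_prob):.3f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
print(f"Brier={brier_score(y_true, y_prob):.3f}  "
      f"net benefit at 0.5={net_benefit(y_true, y_prob, 0.5):.3f}")
```

A lower Brier score means predicted probabilities lie closer to the observed outcomes, which is why the abstract reads its value of 0.147 as good calibration. In practice these quantities would more likely be obtained from an established statistics library than from hand-written functions.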