Development and validation of an XGBoost-based prediction model for acute liver injury in statin users
- DOI:10.12173/j.issn.1005-0698.202502023
- VernacularTitle:基于XGBoost的他汀类药物用药者急性肝损伤预测模型的开发与验证
- Author:Xianglong MENG 1; Yuelin YU; Yexiang SUN; Peng SHEN; Zhiqin JIANG; Yu ZHU; Yueqi YIN; Siyan ZHAN; Shengfeng WANG
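- Author Information:
1. Department of Epidemiology and Biostatistics, School of Public Health, Peking University (Beijing 100191); Key Laboratory of Epidemiology of Major Diseases, Ministry of Education (Beijing 100191)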
- Publication Type:Journal Article
- Keywords:Statins; Acute liver injury; Prediction model; XGBoost algorithm; Cost-sensitive learning
- From:Chinese Journal of Pharmacoepidemiology 2025;34(8):867-876
- Country:China
- Language:Chinese
- Abstract:
Objective: To develop and validate a prediction model that identifies new statin users at high risk of acute liver injury (ALI) within 180 days of initiation, in support of early clinical intervention.
Methods: Data were sourced from the Yinzhou Regional Health Information Platform and covered statin initiators aged 18 years and older from January 1, 2010, to October 31, 2021. The dataset was divided into a derivation cohort and a temporal validation cohort based on the time of statin initiation. Predictors were selected using LASSO regression, and the model was constructed using the extreme gradient boosting (XGBoost) algorithm combined with cost-sensitive learning. Model performance was evaluated using Brier scores, Harrell's C-index, and calibration curves.
Results: A total of 126,440 statin initiators were included, with 90,542 in the derivation cohort and 35,898 in the validation cohort. Within 180 days of initial statin use, 412 (0.33%) patients developed ALI, including 305 (0.34%) in the derivation cohort and 107 (0.30%) in the validation cohort. The final model incorporated 16 predictors, which included demographic characteristics, lifestyle factors, family history, medical history, statin use, and concomitant medication use. In internal validation, the model demonstrated excellent overall performance [Brier score = 0.0043, 95% CI (0.0038, 0.0049)], discrimination [Harrell's C-index = 0.761, 95% CI (0.725, 0.794)], and calibration. In temporal validation, the model also performed well [Brier score = 0.0044, 95% CI (0.0036, 0.0052); Harrell's C-index = 0.703, 95% CI (0.614, 0.781)].
Conclusion: This study developed and validated a prediction model for ALI in statin users, providing clinicians with a reliable tool for individualized risk assessment. The model can help achieve risk stratification and reduce the occurrence of ALI.
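As a rough illustration of the evaluation metrics named in the abstract, the pure-Python sketch below computes a Brier score and a concordance statistic for a binary outcome, plus a negative-to-positive ratio that is one common heuristic for weighting the rare class in cost-sensitive learning. The function names and the toy data are hypothetical, and the abstract does not state how misclassification costs were actually set. For a binary outcome without censoring, the concordance statistic shown here reduces to the area under the ROC curve; it is not a full time-to-event implementation of Harrell's C-index.

```python
def positive_class_weight(n_total, n_events):
    """Ratio of non-events to events: a common heuristic weight for the
    rare class (an assumption, not the weighting reported by the authors)."""
    return (n_total - n_events) / n_events


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def concordance_binary(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half-concordant."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                score += 1.0
            elif p_event == p_non_event:
                score += 0.5
    return score / (len(events) * len(non_events))


# Derivation-cohort counts taken from the abstract: 305 ALI cases among 90,542.
weight = positive_class_weight(90542, 305)

# Hypothetical toy predictions, for illustration only.
toy_outcomes = [0, 0, 1, 0, 1]
toy_risks = [0.1, 0.2, 0.8, 0.4, 0.3]
toy_brier = brier_score(toy_outcomes, toy_risks)
toy_concordance = concordance_binary(toy_outcomes, toy_risks)

print(round(weight, 1))           # 295.9
print(round(toy_brier, 3))        # 0.148
print(round(toy_concordance, 3))  # 0.833
```

With an event rate near 0.3%, a ratio of this kind is large (roughly 296 non-events per event in the derivation cohort), which is why unweighted training tends to neglect the rare class.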