Construction of prognostic model for intravenous thrombolysis in acute ischemic stroke based on interpretable machine learning
- VernacularTitle:基于可解释机器学习构建急性缺血性脑卒中静脉溶栓预后模型
- Author: Juan LI; Dong QI; Lei ZHUANG; Zheng SI
- Keywords: machine learning; Shapley additive explanations; acute ischemic stroke; intravenous thrombolysis; early neurological deterioration; random forest model; large-area cerebral infarction; National Institutes of Health Stroke Scale
- From: Journal of Clinical Medicine in Practice 2025;29(8):28-34
- Country: China
- Language: Chinese
- Abstract: Objective: To construct a machine learning (ML) model for predicting early neurological deterioration (END) after intravenous thrombolysis (IVT) in patients with acute ischemic stroke (AIS), and to analyze risk factors for END using Shapley additive explanations (SHAP). Methods: A total of 97 AIS patients who received IVT were enrolled and divided into an END group (18 cases) and a non-END group (79 cases) according to whether they experienced END within 24 hours after IVT. All patients were randomly assigned to a training set (n=68) and a validation set (n=29) at a ratio of 7 to 3. Univariate analysis and least absolute shrinkage and selection operator (LASSO) analysis were performed to screen the clinical data for important feature variables associated with END. Six ML algorithms (random forest, light gradient boosting machine, decision tree, support vector machine, k-nearest neighbors and extreme gradient boosting) were used to construct predictive models. Receiver operating characteristic (ROC) curves, calibration curves and clinical decision curve analysis (DCA) were used to evaluate the performance of each ML model. The SHAP method was used to interpret the optimal ML model. Results: Among the six ML models, the random forest model was identified as the best predictive model. In the training set, it achieved an area under the curve (AUC) of 0.909, with specificity, precision, recall and F1 score of 0.873, 0.856, 0.910 and 0.825, respectively. In the validation set, its AUC was 0.915, with corresponding values of 0.824, 0.800, 0.945 and 0.834. Calibration curves and DCA demonstrated that the random forest model had higher prediction accuracy and clinical net benefit. SHAP variable importance plots revealed that the top six factors contributing to END were large-area cerebral infarction, pre-thrombolysis National Institutes of Health Stroke Scale (NIHSS) score, door-to-needle time (DNT), history of atrial fibrillation, white blood cell (WBC) levels and history of diabetes. Conclusion: ML models can effectively predict the risk of END in IVT patients, with the random forest model demonstrating the best predictive performance. Combining the model with SHAP-based visual interpretation helps clinicians understand the contribution of each feature variable to the prediction results, thereby facilitating targeted preventive treatment strategies.
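
For readers less familiar with the evaluation metrics reported in the abstract, the following is a minimal sketch of how specificity, precision, recall, F1 score and AUC can be computed for a binary END/non-END classifier. The function names, confusion-matrix counts, labels and scores are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not describe how the authors computed their values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive threshold-based metrics from confusion-matrix counts."""
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    recall = tp / (tp + fn)        # sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: 8 true positives, 2 false positives,
# 18 true negatives, 2 false negatives.
metrics = classification_metrics(tp=8, fp=2, tn=18, fn=2)

# Hypothetical labels (1 = END) and predicted END probabilities.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)

print(metrics)
print(round(auc, 3))
```

The pairwise AUC definition used here is equivalent to the area under the ROC curve, but it scales quadratically with sample size, so it is only suitable for small illustrative inputs.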

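The idea behind SHAP-based interpretation can be illustrated with a minimal sketch that computes exact Shapley values for a single prediction. Everything here is an assumption made for illustration: `toy_risk` is a hypothetical three-feature scoring function, not the study's random forest, and the feature values and baseline are invented. The sketch also replaces absent features with a single baseline input, whereas SHAP implementations typically average over a background dataset and use faster tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical END risk score with one interaction term."""
    large_infarct, nihss, dnt_minutes = z
    return (0.3 * large_infarct + 0.02 * nihss + 0.001 * dnt_minutes
            + 0.01 * large_infarct * nihss)


# Hypothetical patient and reference values:
# [large-area infarction (0/1), pre-thrombolysis NIHSS, DNT in minutes]
patient = [1, 15, 60]
baseline = [0, 5, 45]

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["large_infarct", "nihss", "dnt"], phi):
    print(f"{name}: {contribution:+.4f}")
```

By construction, the contributions sum to the difference between the patient's score and the baseline score, which is the property that lets a SHAP plot decompose one prediction into per-feature effects.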