Machine learning model predicts post-stroke depression in patients with ischemic stroke
10.3760/cma.j.issn.1673-4165.2024.11.002
- VernacularTitle:机器学习模型预测缺血性卒中患者的卒中后抑郁
- Author:
Zhuqing WU (1); Yueyu ZHANG; Chi ZHANG; Juncang WU
Author Information
1. Department of Neurology, Hefei Hospital Affiliated to Anhui Medical University, the Second People's Hospital of Hefei, Hefei 230011
- Keywords:
Ischemic stroke;
Severity of illness index;
Depression;
Risk factors;
Machine learning;
Algorithms
- From:
International Journal of Cerebrovascular Diseases
2024;32(11):807-813
- Country:China
- Language:Chinese
- Abstract:
Objectives: To develop a machine learning model that predicts post-stroke depression (PSD) at 3 months after onset in patients with acute ischemic stroke (AIS).
Methods: Patients with AIS admitted to the Second People's Hospital of Hefei from January 2021 to December 2023 were included retrospectively. Based on their 17-item Hamilton Depression Rating Scale (HAMD) results at 3 months after onset, they were divided into a PSD group and a non-PSD group. Recursive feature elimination (RFE) was used to select the predictor variables for PSD. PSD prediction models for patients with AIS were developed with three machine learning algorithms: logistic regression (LR), random forest (RF), and support vector machine (SVM). Model performance was evaluated with the area under the receiver operating characteristic (ROC) curve (AUC) and with calibration curves. The SHapley Additive exPlanations (SHAP) algorithm was used to analyze the contribution of each risk factor.
Results: A total of 243 patients with AIS were included, including 159 males (64.6%), aged 64.32±11.54 years; the median years of schooling was 6 years, and 13 males (5.3%) lived alone. A history of stroke was present in 105 patients (42.7%). The median baseline National Institutes of Health Stroke Scale (NIHSS) score was 3, and the median baseline modified Rankin Scale (mRS) score was 2. Intravenous thrombolysis was given to 33 patients (13.4%). PSD was present at 3 months after onset in 93 patients (38.27%). RFE showed that the optimal number of features was 11: baseline NIHSS score, baseline mRS score, C-reactive protein, intravenous thrombolysis, low-density lipoprotein cholesterol, small vessel occlusion, D-dimer, total cholesterol, alcohol consumption, right-sided infarction, and baseline systolic blood pressure. ROC curve analysis showed that the RF model had the best predictive performance (AUC=0.831, 95% confidence interval 0.730-0.931), followed by the SVM model (AUC=0.827, 95% confidence interval 0.713-0.941); the LR model had the lowest predictive performance (AUC=0.771, 95% confidence interval 0.658-0.885). The calibration curve of the RF model fitted the ideal curve well, so the RF model was selected as the final model. SHAP showed that baseline NIHSS score, baseline mRS score, low-density lipoprotein cholesterol, total cholesterol, and intravenous thrombolysis were the five largest contributors.
Conclusions: The RF model can effectively predict the risk of PSD. Baseline NIHSS score, baseline mRS score, low-density lipoprotein cholesterol, total cholesterol, and intravenous thrombolysis are the key predictive factors.
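The models in the abstract are compared by AUC. As a minimal sketch of what that metric measures, the Python snippet below computes AUC as the fraction of (PSD, non-PSD) patient pairs in which the PSD patient receives the higher predicted risk. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the authors' code or data, and the record does not state which software the study used.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data, NOT from the study: 1 = PSD, 0 = non-PSD.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# 11 of the 12 (PSD, non-PSD) pairs are ordered correctly.
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

Read this way, an AUC of 0.831 means that in roughly 83% of such pairs the model assigns the higher risk to the patient who developed PSD. An AUC of 0.5 corresponds to chance-level ranking.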