Construction and validation of a prediction model for swallowing disorder in elderly stroke patients based on explainable machine learning
- DOI:10.3969/j.issn.1009-0126.2025.06.003
- VernacularTitle:基于可解释机器学习方法老年脑卒中患者吞咽障碍预测模型的构建与验证
- Author:Yunhan LIU (1); Mingming JIANG; Dongmei LI; Yu DING; Hengge XIE; Kunlun HE; Wuhong ZHOU; Yanshuang CHENG
- Author Information:(1) Department of Neurology, Second Medical Center, Chinese PLA General Hospital, Beijing 100853, China
- Publication Type:Journal Article
- Keywords:stroke; deglutition disorders; forecasting; logistic models; algorithms; machine learning
- From:Chinese Journal of Geriatric Heart Brain and Vessel Diseases 2025;27(6):698-704
- Country:China
- Language:Chinese
- Abstract:
Objective: To construct a risk prediction model for post-stroke dysphagia (PSD) in elderly stroke patients from clinical and laboratory indicators using explainable machine learning. Methods: A retrospective analysis was conducted on 3994 stroke patients hospitalized in the Department of Neurology, First Medical Center of Chinese PLA General Hospital, from October 2010 to December 2021. The 1390 patients admitted between January 2019 and December 2021 formed the external validation set, and the 2604 patients admitted between October 2010 and January 2019 formed the training group. The training group was further divided into a training set (1823 cases) and an internal validation set (781 cases) in a 7:3 ratio, and was also grouped by outcome into a PSD group (773 cases) and a non-PSD group (1831 cases). With the occurrence of swallowing difficulties as the endpoint, risk prediction models were built using random forest (RF), eXtreme Gradient Boosting (XGBoost), support vector machine (SVM), and logistic regression. ROC curve analysis was used to evaluate model performance. After the best-performing model was selected, SHAP was used to interpret feature contributions. Results: Muscle strength, stroke side (right/left), and area of brain injury differed significantly between the PSD and non-PSD groups (P<0.01). Compared with the non-PSD group, the PSD group had significantly higher proportions of hypertension, diabetes, and drinking history, higher neutrophil counts, and lower potassium and albumin levels (P<0.05, P<0.01). Multivariate logistic regression analysis showed that age, drinking history, diabetes, hypertension, muscle strength grade, area of brain injury, hemispheric stroke, neutrophil count, and albumin and potassium levels were risk factors for PSD (P<0.05, P<0.01). In external validation, the areas under the curve of the RF, XGBoost, SVM, and logistic regression models were 0.883, 0.902, 0.877, and 0.868, respectively. The distribution of SHAP values showed that drinking history, hypertension, and diabetes were positively correlated with PSD risk; muscle strength was negatively correlated with the risk; increasing age was positively correlated with the risk; subtentorial lesions showed stronger predictive efficacy than supratentorial lesions and entire lesions; and bilateral and right-sided strokes carried a higher risk of PSD than left-sided stroke. Conclusion: The XGBoost model shows the best performance in predicting the risk of swallowing disorders in elderly patients after stroke.
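
The figures reported in the abstract can be cross-checked with a minimal Python sketch. It verifies the cohort arithmetic, picks the model with the highest reported external-validation AUC, and illustrates one standard way to compute an AUC from labels and scores (the pairwise, Mann-Whitney formulation). Only the case counts and the four AUC values come from the abstract; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
# Case counts as reported in the abstract.
TOTAL, EXTERNAL, DEVELOPMENT = 3994, 1390, 2604
TRAIN, INTERNAL = 1823, 781
PSD, NON_PSD = 773, 1831

# The external set and the training group partition the full cohort.
cohort_ok = EXTERNAL + DEVELOPMENT == TOTAL
# The 7:3 split of the training group, rounded to whole patients.
split_ok = TRAIN + INTERNAL == DEVELOPMENT and round(0.7 * DEVELOPMENT) == TRAIN
# The PSD and non-PSD groups partition the training group.
outcome_ok = PSD + NON_PSD == DEVELOPMENT
psd_rate = PSD / DEVELOPMENT  # share of PSD cases in the training group

# External-validation AUC values as reported in the abstract.
reported_auc = {
    "RF": 0.883,
    "XGBoost": 0.902,
    "SVM": 0.877,
    "Logistic regression": 0.868,
}
best_model = max(reported_auc, key=reported_auc.get)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example with made-up labels and scores (not study data).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

if __name__ == "__main__":
    print(cohort_ok, split_ok, outcome_ok, round(psd_rate, 3))
    print(best_model, round(toy_auc, 3))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based implementation would be preferable for a cohort of several thousand patients.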