1. Construction and external validation of a machine learning-based prediction model for epilepsy one year after acute stroke.
Wenkao ZHOU ; Fangli ZHAO ; Xingqiang QIU ; Yujuan YANG ; Tingting WANG ; Lingyan HUANG
Chinese Critical Care Medicine 2025;37(5):445-451
OBJECTIVE:
To identify the optimal machine learning algorithm for predicting post-stroke epilepsy (PSE) within one year following acute stroke, establish a nomogram model based on this algorithm, and perform external validation to achieve accurate prediction of secondary epilepsy.
METHODS:
A total of 870 acute stroke patients admitted to the emergency department of Xiang'an Hospital of Xiamen University from June 2019 to June 2023 were enrolled for model development (model group). An external validation cohort of 435 acute stroke patients admitted to the Fifth Hospital of Xiamen during the same period was used to validate the machine learning algorithms and the nomogram model. Patients were classified into control and epilepsy groups according to whether PSE developed within one year. Clinical and laboratory data, including baseline characteristics, stroke location, vascular status, complications, hematologic parameters, and National Institutes of Health Stroke Scale (NIHSS) score, were collected for analysis. Nine machine learning algorithms (logistic regression, CN2 rule induction, K-nearest neighbors, adaptive boosting, random forest, gradient boosting, support vector machine, naive Bayes, and neural network) were applied and their predictive performance was compared. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to identify the optimal algorithm. Logistic regression was used to screen risk factors for PSE, and the top 10 predictors were selected to construct the nomogram model. The predictive performance of the model was evaluated using the ROC curve in both the model and validation groups.
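The AUC-based comparison of algorithms described above can be illustrated with a minimal sketch. It assumes the rank-based (Mann-Whitney) definition of the AUC, and the function names, model names, and toy predictions are hypothetical; none of them come from the study.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive case outranks a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count concordant (positive, negative) pairs; ties count as half a pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_model(labels, predictions_by_model):
    """Return (name, auc) for the model with the highest AUC."""
    aucs = {name: auc_score(labels, scores)
            for name, scores in predictions_by_model.items()}
    name = max(aucs, key=aucs.get)
    return name, aucs[name]


# Hypothetical toy data: 1 = developed PSE within one year, 0 = did not.
labels = [0, 0, 1, 0, 1, 0]
predictions = {
    "model_a": [0.1, 0.2, 0.9, 0.3, 0.8, 0.4],
    "model_b": [0.5, 0.6, 0.4, 0.2, 0.7, 0.1],
}
print(best_model(labels, predictions))
```

With only 29 events among 870 patients, as in the model group, an AUC estimate of this kind rests on few positive cases, so it is usually reported together with a confidence interval.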
RESULTS:
Among the 870 patients in the model group, 29 developed PSE within one year. Among the nine algorithms tested, logistic regression demonstrated the best performance and generalizability, with an AUC of 0.923. Univariate logistic regression identified several risk factors for PSE, including platelet count, white blood cell count, red blood cell count, glycated hemoglobin (HbA1c), C-reactive protein (CRP), triglycerides, high-density lipoprotein (HDL), aspartate aminotransferase (AST), alanine aminotransferase (ALT), activated partial thromboplastin time (APTT), thrombin time, D-dimer, fibrinogen, creatine kinase (CK), creatine kinase-MB (CK-MB), lactate dehydrogenase (LDH), serum sodium, lactic acid, anion gap, NIHSS score, brain herniation, periventricular stroke, and carotid artery plaque. Multivariate logistic regression analysis showed that white blood cell count, HDL, fibrinogen, lactic acid, and brain herniation were independent risk factors [odds ratios (OR) were 1.837, 198.039, 47.025, 11.559, and 70.722, respectively; all P < 0.05]. In the external validation group, univariate logistic regression analysis showed that platelet count, white blood cell count, CRP, triglycerides, APTT, D-dimer, fibrinogen, CK, CK-MB, LDH, NIHSS score, and brain herniation were risk factors for PSE within one year after acute stroke. Multivariate logistic regression analysis showed that APTT and brain herniation were independent predictors (OR were 0.587 and 116.193, respectively; both P < 0.05). The nomogram model, constructed using 10 key variables (brain herniation, periventricular stroke, carotid artery plaque, white blood cell count, triglycerides, thrombin time, D-dimer, serum sodium, lactic acid, and NIHSS score), achieved an AUC of 0.908 in the model group and 0.864 in the external validation group.
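To help read the odds ratios reported for the model group, the sketch below converts each OR into the logistic regression coefficient it implies (beta = ln OR) and shows how the odds scale with a change in the predictor. The function names are hypothetical. The abstract does not state the unit of each predictor, so "one unit" here means one unit on whatever scale the authors used.

```python
import math

# Odds ratios reported for the model group in the multivariate analysis.
reported_or = {
    "white blood cell count": 1.837,
    "HDL": 198.039,
    "fibrinogen": 47.025,
    "lactic acid": 11.559,
    "brain herniation": 70.722,
}


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, units):
    """Factor by which the odds change when the predictor rises by `units`."""
    return odds_ratio ** units


for name, value in reported_or.items():
    print(f"{name}: beta = {log_odds_coefficient(value):.3f}")
```

An OR below 1, such as the 0.587 reported for APTT in the validation group, gives a negative coefficient, meaning that higher values are associated with lower odds of PSE.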
CONCLUSIONS:
Among the machine learning algorithms compared, the logistic regression-based model for predicting epilepsy within one year after acute stroke showed the best predictive performance. The nomogram model based on the logistic regression-derived predictors showed strong discriminative power and was successfully validated externally, suggesting favorable clinical applicability and generalizability.
Humans ; Machine Learning ; Stroke/complications* ; Nomograms ; Epilepsy/etiology* ; Algorithms ; Male ; Female ; Logistic Models ; Middle Aged ; Aged ; Risk Factors ; Bayes Theorem
2. An evidence-based predictive model for early recurrence risk after hepatocellular carcinoma surgery and external validation study
Wenkao ZHOU ; Fangli ZHAO ; Jiajia CHEN ; Lei CHEN ; Lingyan HUANG ; Yue WANG ; Huimin TANG
Cancer Research and Clinic 2024;36(11):835-842
Objective: To construct an evidence-based prediction model for early recurrence after surgery for hepatocellular carcinoma (HCC) based on Meta-analysis and to validate it externally. Methods: Studies published between January 2019 and December 2023 were retrieved by subject-word searches of the China National Knowledge Infrastructure, Wanfang, VIP, Chinese Science Citation Database (CSCD), Chinese Social Sciences Citation Index (CSSCI), PubMed, Web of Science and IEEE databases. According to the inclusion and exclusion criteria, 9 studies were included to screen for risk factors affecting the early recurrence of HCC. When the same risk factor was reported in ≥5 included studies, Meta-analysis was performed using Review Manager 5.4.1 software. External validation data were collected from 401 patients with primary HCC who underwent surgery at Liaoning Cancer Hospital between March 2014 and March 2017. The patients were divided into an early recurrence group (176 cases) and an early non-recurrence group (225 cases) according to whether they relapsed within 2 years after surgery. The OR values of all risk factors obtained in the Meta-analysis were converted into model coefficients, and the postoperative early recurrence rate of HCC in the Meta-analysis was used to calculate β0, yielding the logistic model. The converted OR values were entered into the logit (P) model, and the predicted probability of recurrence (P) was calculated for each patient in the external validation data. With recurrence within 2 years after surgery as the dependent variable and P as the independent variable, the receiver operating characteristic (ROC) curve was drawn and the area under the curve (AUC) was calculated.
Results: A total of 8 risk factors for early HCC recurrence were screened out from the 9 studies (x1: alpha-fetoprotein ≥ 400 ng/ml; x2: tumor number ≥ 2; x3: longest tumor diameter ≥ 5 cm; x4: Barcelona stage B-C; x5: microvascular invasion; x6: moderate to low differentiation; x7: incomplete capsule; x8: nonanatomic hepatectomy). The Meta-analysis included 1 757 HCC cases, with 960 postoperative early recurrences and an early recurrence rate of 45.36%; the resulting β0 value was -0.201. The predictive model for 2-year recurrence of HCC was logit (P) = -0.201 + 0.835x1 + 0.905x2 + 0.783x3 + 1.008x4 + 0.765x5 + 0.831x6 + 1.533x7 + 0.940x8. Analysis of variance of the external validation data showed that the differences in ascites, alpha-fetoprotein, tumor number, tumor diameter, Barcelona stage, microvascular invasion, tumor differentiation degree, capsule invasion, resection type, and systemic inflammation index between the early recurrence group and the early non-recurrence group were statistically significant (all P < 0.05). ROC curve analysis showed that the AUC of the model for predicting postoperative early recurrence of HCC was 0.718 (95% CI: 0.689-0.753); the optimal cut-off value was 3.11, the Youden index was 0.288, the sensitivity was 69.32%, and the specificity was 69.56%. Conclusions: The evidence-based prediction model constructed from Meta-analysis for postoperative early recurrence of HCC has a high predictive value. However, further verification and optimization with large datasets are still needed.
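As an illustration of how the reported logit (P) equation could be applied to a single patient, the following sketch evaluates the model and converts the result to a probability with the standard logistic function. It assumes each x is coded 1 when the stated criterion is met and 0 otherwise, which the thresholds in the abstract suggest but do not state outright. The dictionary keys and function names are hypothetical.

```python
import math

# Intercept and coefficients as reported in the abstract.
BETA_0 = -0.201
BETAS = {
    "afp_ge_400": 0.835,                    # x1: alpha-fetoprotein >= 400 ng/ml
    "tumor_number_ge_2": 0.905,             # x2: tumor number >= 2
    "diameter_ge_5cm": 0.783,               # x3: longest tumor diameter >= 5 cm
    "barcelona_stage_b_c": 1.008,           # x4: Barcelona stage B-C
    "microvascular_invasion": 0.765,        # x5
    "moderate_low_differentiation": 0.831,  # x6
    "incomplete_capsule": 1.533,            # x7
    "nonanatomic_hepatectomy": 0.940,       # x8
}


def logit_p(patient):
    """Linear predictor; risk factors missing from `patient` count as 0 (assumed coding)."""
    return BETA_0 + sum(beta * patient.get(name, 0) for name, beta in BETAS.items())


def recurrence_probability(patient):
    """Inverse logit: P = 1 / (1 + exp(-logit(P)))."""
    return 1.0 / (1.0 + math.exp(-logit_p(patient)))


# Hypothetical patient with high alpha-fetoprotein and microvascular invasion only.
example = {"afp_ge_400": 1, "microvascular_invasion": 1}
print(round(logit_p(example), 3), round(recurrence_probability(example), 3))
```

Because every coefficient is positive, each additional risk factor raises the predicted probability, and a patient with none of the eight factors is assigned the baseline probability implied by β0 alone.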
