Prediction of Recurrence Risk of Diffuse Large B-cell Lymphoma based on SMOTE-ENN and Deep Forest
10.11783/j.issn.1002-3674.2025.01.012
- VernacularTitle:基于SMOTE-ENN和深度森林的弥漫大B细胞淋巴瘤复发风险预测
- Author:Yu QIAO 1; Yanbo ZHANG; Hongmei YU
Author Information
1. Department of Health Statistics, School of Public Health, Shanxi Medical University (030001)
- Publication Type:Journal Article
- Keywords:
Diffuse large B-cell lymphoma;
Unbalanced data;
Recurrence prediction;
Deep forest
- From:Chinese Journal of Health Statistics 2025;42(1):67-72
- Country:China
- Language:Chinese
- Abstract:
Objective To construct a 2-year relapse risk prediction model using data from 498 patients diagnosed with diffuse large B-cell lymphoma (DLBCL) who achieved complete response (CR) following treatment at the hematology department of a cancer hospital in Shanxi Province between 2011 and 2020, providing a reference for clinical management. Methods The least absolute shrinkage and selection operator (LASSO) feature selection algorithm, combined with clinical expertise, was first used to identify 21 significant variables influencing the 2-year relapse rate in DLBCL patients with CR. To address data imbalance, the synthetic minority oversampling technique (SMOTE) and SMOTE combined with edited nearest neighbors (SMOTE-ENN) were applied. Relapse predictions were conducted using seven classifiers on both the original and balanced datasets. The deep forest (DF) algorithm was then employed to build the relapse risk prediction model. Model performance was evaluated using accuracy, precision, sensitivity/recall, specificity, F1-score, and G-mean, while calibration was assessed using the Brier score. Results The deep forest algorithm, when combined with the SMOTE-ENN method for handling data imbalance, achieved the best performance (accuracy = 0.932, precision = 0.949, recall = 0.944, specificity = 0.910, F1-score = 0.946, G-mean = 0.926, Brier score = 0.068). Conclusion This study combines the SMOTE-ENN technique with the deep forest classifier to predict 2-year relapse risk in DLBCL patients who achieved CR. The model demonstrates excellent performance and meets expectations.
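The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions (relapse coded as 1, G-mean as the geometric mean of recall and specificity, Brier score as the mean squared difference between predicted probability and outcome); the abstract does not state the exact formulas used. The function names, the toy labels and probabilities, and the 0.5 decision threshold are all hypothetical and are not taken from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN with relapse coded as 1 and no relapse as 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics (assumes no denominator is zero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        # G-mean balances performance on the minority and majority classes.
        "g_mean": math.sqrt(recall * specificity),
    }


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 4 relapses and 6 non-relapses (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.4, 0.7, 0.2, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
metrics["brier"] = brier_score(y_true, y_prob)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions a lower Brier score indicates better calibration, whereas the other metrics improve as they approach 1. Because G-mean falls sharply when either recall or specificity is poor, it is a common choice for imbalanced data such as relapse prediction.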