Construction and analysis of machine learning models for preoperative prediction of glioma grading and isocitrate dehydrogenase mutation status

Yuting WANG; Junle ZHU; Shuang QIN; Saifei SUN; Xin ZHANG; Qi LÜ

Vernacular Title: 术前脑胶质瘤分级与异柠檬酸脱氢酶突变状态机器学习预测模型构建及分析
Author: Yuting WANG ¹ ; Junle ZHU ² ; Shuang QIN ³ ; Saifei SUN ⁴ ; Xin ZHANG ⁵ ; Qi LÜ ³
Author Information

1. Department of Radiology, Hospital of Sihong, Suqian 223900, Jiangsu, China.
2. Department of Neurosurgery, Tongji Hospital Affiliated to Tongji University, Shanghai 200065, China.
3. Department of Radiology, Xuhui District Central Hospital, Shanghai 200237, China.
4. Department of Radiology, Tongji Hospital Affiliated to Tongji University, Shanghai 200065, China.
5. School of Life Science and Technology, ShanghaiTech University, Shanghai 200031, China.
Publication Type: AI4M
Keywords: glioma grading; isocitrate dehydrogenase mutation; machine learning; imaging features; inflammation indicator
From: Chinese Journal of Clinical Medicine 2026;33(1):3-15
Country: China
Language: Chinese
Abstract: Objective To construct machine learning models based on preoperative inflammatory and radiological features for predicting glioma grade and isocitrate dehydrogenase (IDH) mutation status, to analyze the application value of these models, and to identify the optimal predictive model for each task. Methods A retrospective analysis was conducted on data from patients with pathologically confirmed glioma admitted to Tongji Hospital Affiliated to Tongji University from March 2019 to March 2023. LASSO regression was used to screen feature variables, and predictive models were constructed using logistic regression (LR), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and K-nearest neighbor (KNN) algorithms. Model performance was comprehensively evaluated using metrics including discrimination ability, area under the precision-recall curve (AUC), accuracy, F1 score, and Brier score. The DeLong test was used to compare AUC values among models; the Friedman rank-sum test was used to assess overall performance differences among the models, with the Nemenyi test applied for multiple comparison correction. Results In the glioma grading task, the LR model achieved the highest comprehensive score (0.726), and no significant difference was observed between the LR model and the other four models; age was positively correlated with glioma grade (P=0.003). In the IDH mutation status task, the XGBoost model obtained the highest comprehensive score (0.832), which was superior to that of the LR (0.762, P=0.035) and KNN (0.754, P=0.025) models, while no statistically significant differences were found between the XGBoost model and the RF or SVM models.
Conclusions The LR model for glioma grading prediction and the XGBoost model for IDH mutation prediction, constructed using a task-oriented strategy, achieve favorable interpretability while maintaining optimized performance, thereby providing reliable decision support for the individualized diagnosis and treatment of glioma.
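
Three of the evaluation metrics named in the Methods (accuracy, F1 score, and Brier score) can be illustrated with a minimal Python sketch for a binary task such as IDH mutant versus wild-type. This is not the authors' code: the function name `evaluate_binary`, the 0.5 decision threshold, and the sample labels and probabilities are assumptions made only for illustration.

```python
from typing import Dict, Sequence


def evaluate_binary(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Compute accuracy, F1 score and Brier score for a binary classifier."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")

    # Convert predicted probabilities into hard class labels.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)

    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # Brier score: mean squared error of the predicted probabilities
    # (lower is better; 0 means perfectly calibrated and confident).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {"accuracy": accuracy, "f1": f1, "brier": brier}


# Hypothetical data, not taken from the study.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7]
scores = evaluate_binary(y_true, y_prob)
print(scores)
```

Accuracy and F1 depend on the chosen threshold, whereas the Brier score is computed directly from the probabilities, which is one reason such metrics are often reported together.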