Machine learning model based on contrast enhanced CT images for predicting mitotic index in gastrointestinal stromal tumors: a dual-center study

Wenjun DIAO; Xiaobo CHEN; Ximing WANG; Hexiang WANG; Xingyu CHEN; Yanqi HUANG; Zaiyi LIU

Vernacular Title: 基于增强CT图像的机器学习模型预测胃肠道间质瘤核分裂象计数的双中心研究
Author: Wenjun DIAO ¹ ; Xiaobo CHEN ; Ximing WANG ; Hexiang WANG ; Xingyu CHEN ; Yanqi HUANG ; Zaiyi LIU
Author Information

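1. Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China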
Publication Type: Journal Article
Keywords: Gastrointestinal stromal tumors; Tomography, X-ray computed; Mitotic Index; Machine learning
From: Chinese Journal of Radiology 2025;59(5):549-557
Country: China
Language: Chinese
Abstract: Objective: To develop and validate machine learning-based radiomics models using preoperative CT images for individualized prediction of mitotic index (MI) in patients with gastrointestinal stromal tumors (GIST).
Methods: This was a case-control study. Data from 348 patients with pathologically confirmed GIST were retrospectively collected from two independent medical centers, the Affiliated Hospital of Qingdao University (center 1) and Shandong Provincial Hospital Affiliated to Shandong First Medical University (center 2), covering the period from January 2013 to June 2018. Patients from center 1 were randomly divided into a training cohort (176 cases) and an internal validation cohort (75 cases) at a ratio of 7:3. Patients from center 2 served as an independent external validation cohort (97 cases). The primary endpoint was MI, categorized into high MI (145 cases) and low MI (203 cases) groups. Radiomic features were extracted from the portal venous phase images of preoperative contrast-enhanced CT scans. Five machine learning algorithms (logistic regression, support vector machine, random forest, decision tree, and extreme gradient boosting [XGBoost]) were used to construct MI prediction models. The optimal model was identified using receiver operating characteristic curves. An individualized prediction model was developed by combining the optimal machine learning model with selected independent clinical factors, and feature importance was visualized using Shapley Additive Explanation (SHAP) analysis. Patients were followed up, and Kaplan-Meier curves and log-rank tests were used to evaluate differences in recurrence-free survival (RFS) between the predicted high MI and low MI groups.
Results: Among the five machine learning models, the XGBoost model showed the best predictive performance, with areas under the curve (AUC) of 0.809 (95% CI 0.738-0.872), 0.693 (95% CI 0.571-0.809), and 0.718 (95% CI 0.605-0.822) in the training, internal validation, and external validation cohorts, respectively. An individualized prediction model combining the XGBoost model with independent clinical factors (tumor location and tumor size) was developed. This model achieved AUC of 0.843 (95% CI 0.785-0.899), 0.791 (95% CI 0.680-0.894), and 0.777 (95% CI 0.678-0.861) in the training, internal validation, and external validation cohorts, respectively. SHAP analysis indicated that radiomic features had the highest predictive impact. In both the training cohort and the internal validation cohort, patients predicted to have high MI had worse RFS than those predicted to have low MI, with statistically significant differences (χ²=14.58, 9.52, both P<0.001). However, there was no statistically significant difference in RFS in the external validation cohort (χ²=6.18, P=0.080).
Conclusions: The optimal XGBoost model based on radiomic features extracted from preoperative portal venous phase CT images, when combined with clinical factors, can effectively predict the MI of GIST patients.
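The cohort split and the AUC values with 95% CIs reported above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code. The abstract does not state how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names (split_7_3, auc, bootstrap_ci) and the seed are hypothetical. A 7:3 split of the 251 center 1 patients does reproduce the reported cohort sizes of 176 and 75.

```python
import random


def split_7_3(ids, seed=0):
    """Randomly split patient IDs into training and validation lists at 7:3."""
    rng = random.Random(seed)  # the seed is an arbitrary, illustrative choice
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(0.7 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (e.g. high MI) or 0 (low MI); tied scores count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not from the source)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

For example, split_7_3(range(251)) returns lists of 176 and 75 IDs, and auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]) returns 0.75 because three of the four positive-negative pairs are ranked correctly. In practice the scores would be the predicted probabilities of high MI from the fitted model, evaluated separately in each cohort.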