Ability of artificial intelligence system to predict invasion depth and differentiation status of early gastric cancer: performance in single-center and multi-center videos
10.3760/cma.j.cn321463-20231206-00514
- VernacularTitle:人工智能系统预测早期胃癌浸润深度和分化状态的能力
- Author: Ting YANG 1; Zehua DONG 1; Xiao TAO 1; Lianlian WU 1; Honggang YU 1
Author Information
1. Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Publication Type:Journal Article
- Keywords: Artificial intelligence; Early gastric cancer; Invasion depth; Differentiation status
- From: Chinese Journal of Digestive Endoscopy 2025;42(6):452-461
- Country: China
- Language:Chinese
- Abstract:
Objective: To evaluate the ability of the ENDOANGEL artificial intelligence system to predict invasion depth and differentiation status of early gastric cancer using more diverse multi-center videos, and to test the performance of ENDOANGEL-2022, the new system upgraded from ENDOANGEL.
Methods: Building on the man-machine competition for early gastric cancer diagnosis completed in 2020 with single-center videos, a second man-machine competition was conducted in 2022, involving 30 endoscopists from 30 hospitals across 10 Chinese provinces. A multi-center video cohort was retrospectively collected from 12 institutions in 8 provinces/municipalities in China. The study proceeded in 3 stages. First, ENDOANGEL was re-tested on the multi-center videos, its performance on single-center and multi-center videos was compared, and it was then upgraded to ENDOANGEL-2022. Second, the second man-machine competition was conducted between ENDOANGEL-2022 and the 30 endoscopists using the multi-center videos, and the performance of ENDOANGEL-2022, ENDOANGEL and the endoscopists on these videos was compared. Third, ENDOANGEL-2022 was re-tested on the single-center videos collected in 2020, and its performance on single-center and multi-center videos was compared.
Results: Compared with its performance on single-center videos, the sensitivity of ENDOANGEL for predicting submucosal invasion of early gastric cancer decreased significantly on multi-center videos [18.18% (2/11) vs 70.00% (7/10), P=0.030], whereas its ability to predict undifferentiated-type early gastric cancer was comparable (P>0.05). On multi-center videos, for predicting submucosal invasion of early gastric cancer, the sensitivity of ENDOANGEL-2022 was higher than that of ENDOANGEL, though not significantly so [40.00% (4/10) vs 18.18% (2/11), P=0.361], and lower than that of the 30 endoscopists [40.00% vs 52.04% (95% CI: 43.70%-60.38%), P<0.001].
The specificity of ENDOANGEL-2022 was lower than that of ENDOANGEL [82.86% (29/35) vs 100.00% (34/34), χ2=4.41, P=0.036] and higher than that of the 30 endoscopists [82.86% vs 68.97% (95% CI: 60.83%-77.11%), P=0.018]. The accuracy of ENDOANGEL-2022 was lower than that of ENDOANGEL, though not significantly so [73.33% (33/45) vs 80.00% (36/45), χ2=0.56, P=0.455], and higher than that of the 30 endoscopists [73.33% vs 65.30% (95% CI: 60.61%-69.99%), P=0.018]. For predicting undifferentiated-type early gastric cancer, the sensitivity of ENDOANGEL-2022 was higher than that of ENDOANGEL, without reaching significance [71.43% (5/7) vs 57.14% (4/7), P>0.999], and higher than that of the 30 endoscopists [71.43% vs 63.11% (95% CI: 55.58%-70.64%), P=0.031]. Its specificity was slightly lower than that of ENDOANGEL [76.32% (29/38) vs 78.95% (30/38), χ2=0.08, P=0.783] and higher than that of the 30 endoscopists [76.32% vs 65.27% (95% CI: 59.10%-71.44%), P=0.004]. Its accuracy was identical to that of ENDOANGEL [75.56% (34/45) vs 75.56% (34/45), χ2=0.00, P>0.999] and higher than that of the 30 endoscopists [75.56% vs 65.10% (95% CI: 59.96%-70.24%), P<0.001]. Compared with its performance on single-center videos, the sensitivity [40.00% vs 60.00% (6/10), P=0.656], specificity [82.86% vs 93.75% (15/16), χ2=0.37, P=0.542] and accuracy [73.33% vs 80.77% (21/26), χ2=0.50, P=0.479] of ENDOANGEL-2022 for predicting submucosal invasion of early gastric cancer were lower on multi-center videos. For predicting undifferentiated-type early gastric cancer, the sensitivity of ENDOANGEL-2022 was higher on multi-center videos [71.43% vs 37.50% (3/8), P=0.315], while the specificity [76.32% vs 100.00% (18/18), χ2=3.48, P=0.062] and accuracy [75.56% vs 80.77% (21/26), χ2=0.26, P=0.612] were lower. None of these single-center versus multi-center differences for ENDOANGEL-2022 reached statistical significance (all P>0.05).
Conclusion: Multi-center cases introduce greater heterogeneity, which may reduce the prediction accuracy of artificial intelligence, but the artificial intelligence system still outperforms endoscopists.
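The abstract does not name the statistical tests behind its between-system comparisons. As a rough sketch only, the Python below assumes a two-sided Fisher's exact test for the comparisons reported without a χ2 value, and a Pearson χ2 test on a 2x2 table (optionally with Yates' continuity correction) for those reported with one. The function names `fisher_exact_two_sided` and `chi_square_2x2` are hypothetical helpers written for this illustration, and each table row is entered as (correct, incorrect) counts taken from the fractions in the Results.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed test choice: the abstract does not state which test was used.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)

    # Unnormalised hypergeometric weights, kept as integers so that
    # "at most as likely as the observed table" is an exact comparison.
    def weight(k):
        return comb(col1, k) * comb(n - col1, row1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(n, row1)


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square statistic and P value (1 df) for [[a, b], [c, d]].

    With yates=True, Yates' continuity correction is applied (an assumption
    about how small expected counts were handled).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


# Sensitivity of ENDOANGEL, multi-center 2/11 vs single-center 7/10
print(f"{fisher_exact_two_sided(2, 9, 7, 3):.3f}")  # 0.030

# Specificity on multi-center videos, ENDOANGEL-2022 29/35 vs ENDOANGEL 34/34
stat, p_value = chi_square_2x2(29, 6, 34, 0, yates=True)
print(f"{stat:.2f} {p_value:.3f}")  # 4.41 0.036
```

Under these assumptions the two calls give P=0.030 and χ2=4.41 (P=0.036), matching the figures in the Results. The accuracy comparison (33/45 vs 36/45) gives χ2=0.56 only without the correction, so the correction may have been applied just where expected cell counts were small; that reading is an inference from the numbers, not something the abstract states.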