1. Development and validation of a recognition and classification system for portal hypertensive gastropathy based on deep learning
Haowen GU ; Jie YANG ; Yong XIAO ; Xinyue WAN ; Wei HU ; Xianmu XIE ; Dingpeng HUANG ; Chengming YAO ; Xinliang SHI ; Shiqian LIU ; Li HUANG ; Chi ZHANG ; Biqing ZHENG ; Mingkai CHEN
Chinese Journal of Digestive Endoscopy 2025;42(10):789-795
Objective: To develop a deep learning-based system for real-time recognition and classification of portal hypertensive gastropathy (PHG) and to evaluate its ability to assist junior endoscopists.
Methods: A total of 2,848 gastroscopy images from 832 patients with liver cirrhosis were selected from the Digestive Endoscopy Center databases of Renmin Hospital of Wuhan University, Wuhan Hospital of Traditional Chinese and Western Medicine, and the Second Hospital of Jingzhou from January 2015 to October 2023. The system was based on 3 endoscopic features of the Baveno Ⅱ scoring system, and a separate model was developed for each: gastric antral vascular ectasia (GAVE), mosaic-like pattern (MLP), and red marks (RM). The classification scheme was as follows: (1) GAVE model: 0 no, 1 yes; (2) MLP model: 0 no, 1 mild, 2 severe; (3) RM model: 0 no, 1 isolated, 2 fused. The classifications of the endoscopic features of PHG made by 3 endoscopy experts were taken as the gold standard. The YOLOv8-m model was used for training. The images were allocated to the training, validation, and test datasets at a ratio of 8:1:1. The test dataset was used to evaluate the performance of the models and their auxiliary effect on endoscopists. The accuracy, recall, precision, specificity, and Kappa coefficient were calculated.
Results: The accuracy, recall, and specificity of the GAVE model were 96.0% (48/50), 87.5% (7/8), and 97.6% (41/42), respectively. There was no significant difference between its accuracy and the gold standard (χ²=316.226, P=1.000). The precision for GAVE1 and GAVE0 was 87.5% (7/8) and 97.6% (41/42), respectively. The accuracy of the MLP model was 84.1% (132/157), with no significant difference compared with the gold standard (χ²=3.286, P=0.193). The precision and recall were 88.2% (15/17) and 75.0% (15/20) for MLP2, 77.9% (60/77) and 88.2% (60/68) for MLP1, and 90.5% (57/63) and 82.6% (57/69) for MLP0. The accuracy of the RM model was 87.9% (123/140), with no significant difference compared with the gold standard (χ²=2.891, P=0.409). The precision and recall were 94.7% (18/19) and 78.3% (18/23) for RM2, 72.2% (26/36) and 81.3% (26/32) for RM1, and 92.9% (79/85) and 92.9% (79/85) for RM0. With the assistance of the GAVE, MLP, and RM models, the mean accuracy of the 3 junior endoscopists increased from 95.3% to 99.3%, from 83.9% to 91.9%, and from 81.9% to 83.1%, respectively. In the overall consistency analysis of the 3 junior endoscopists against the gold standard, agreement on GAVE was extremely strong both before and after assistance (overall Kappa of 1.000 in both cases); agreement on MLP was moderate before assistance (overall Kappa of 0.601) and extremely strong after assistance (overall Kappa of 0.964); and agreement on RM was relatively strong both before and after assistance (overall Kappa of 0.792 and 0.798, respectively).
Conclusion: The deep learning system accurately identifies and classifies PHG features and significantly enhances the diagnostic performance of junior endoscopists.
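The reported percentages can be checked against the fractions given in the Results. The sketch below is not the authors' code. It assumes that each fraction's numerator is the number of correctly classified test images, that the precision denominator is the number of images the model assigned to that class, and that the recall denominator is the number of images the experts assigned to that class. The names `pct` and `summarize` and the dictionary layout are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-class counts copied from the abstract, stored as
# (correct, predicted as this class, labelled as this class by the experts).
# The dictionary layout and all names here are illustrative assumptions.
GAVE = {"GAVE0": (41, 42, 42), "GAVE1": (7, 8, 8)}
MLP = {"MLP0": (57, 63, 69), "MLP1": (60, 77, 68), "MLP2": (15, 17, 20)}
RM = {"RM0": (79, 85, 85), "RM1": (26, 36, 32), "RM2": (18, 19, 23)}


def pct(num, den):
    """Percentage to one decimal place, rounding halves up."""
    value = Decimal(100 * num) / Decimal(den)
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


def summarize(counts):
    """Return (accuracy, {class: (precision, recall)}), all in percent."""
    correct = sum(c for c, _, _ in counts.values())
    predicted = sum(p for _, p, _ in counts.values())
    labelled = sum(a for _, _, a in counts.values())
    # Each test image has exactly one prediction and one expert label,
    # so the two totals must agree.
    if predicted != labelled:
        raise ValueError("predicted and labelled totals disagree")
    per_class = {
        name: (pct(c, p), pct(c, a)) for name, (c, p, a) in counts.items()
    }
    return pct(correct, labelled), per_class


for name, counts in (("GAVE", GAVE), ("MLP", MLP), ("RM", RM)):
    accuracy, per_class = summarize(counts)
    print(name, accuracy, per_class)
```

For the binary GAVE model, the recall of GAVE0 is the specificity reported in the abstract. Halves are rounded up explicitly because Python's built-in `round` rounds exact halves to even, which would turn 26/32 into 81.2 rather than the reported 81.3.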
