Recognition of normal fetal echocardiogram based on an explainable denoising deep learning model
10.3760/cma.j.cn131148-20250106-00011
- VernacularTitle:基于可解释性去噪深度学习模型的正常胎儿超声心动图识别
- Author:
Shuhao SONG 1; Shi ZENG; Ganqiong XU; Yang YANG; Yushan LIU; Pan YANG; Heyi TAN
Author Information
1. Department of Ultrasound Diagnosis, Xiangya Second Hospital of Central South University, Changsha 410011
- Publication Type:Journal Article
- Keywords:
Echocardiography;
Deep learning;
Fetus
- From:
Chinese Journal of Ultrasonography
2025;34(6):511-517
- Country:China
- Language:Chinese
- Abstract:
Objective: To evaluate the value of the proposed interpretable denoising deep learning model, the grouped-sharing convolutional attention visual transformer (GSCAViT), for classifying normal fetal echocardiograms. Methods: A total of 2 501 images from 358 fetuses who underwent cardiac ultrasound examinations at Xiangya Second Hospital of Central South University from January to November 2024 were retrospectively analyzed. GSCAViT was constructed based on fetal echocardiograms of seven standard views: the three-vessel and trachea view, apical four-chamber view, long-axis view of the aortic arch, bicaval view, left ventricular outflow tract view, three-vessel view and right ventricular outflow tract view. Its classification performance was compared with that of both baseline and improved models on the validation set in terms of accuracy, precision, recall and F1-score. Its generalizability across test sets was assessed using the area under the ROC curve (AUC), sensitivity, specificity and F1-score. The impact of image features was interpreted using SHapley Additive exPlanations (SHAP). The effectiveness of the GSCA module was evaluated through visual analysis, image parameter metrics and classification performance. Results: The GSCAViT model achieved classification performance for fetal echocardiograms second only to MaxViT, with an accuracy of 97.1%, precision of 97.1%, recall of 97.0% and an F1-score of 97.0%. In the E10, E20 and E8 test sets, the AUCs of GSCAViT for the prediction of 7 types of fetal echocardiograms were 0.994, 0.928 and 0.932; the sensitivities were 99.4%, 81.3% and 72.9%; the specificities were 99.7%, 96.8% and 94.8%; and the F1-scores were 99.4%, 81.3% and 67.6%, respectively. SHAP visualization indicated that the model could identify key structural features within the images.
Images processed by the denoising-guided group-sharing convolutional attention module best captured and enhanced important regional features, achieving the highest contrast-to-noise ratio, the highest peak signal-to-noise ratio and the best classification performance, which demonstrated the module's effectiveness. Conclusions: The proposed GSCAViT model exhibits superior performance in classifying seven types of normal fetal echocardiograms compared with the baseline and some improved models. Furthermore, SHAP visualization enhances the interpretability of the classification results, and visual comparisons, image parameter analyses and classification performance metrics confirm the effectiveness of the denoising-guided group-sharing convolutional attention module in the visual transformer model.