Comparative evaluation of deep learning architectures, including UNet, TransUNet, and MIST, for left atrium segmentation in cardiac computed tomography of congenital heart diseases

Seoyeong YUN; Jooyoung CHOI

Author: Seoyeong YUN¹; Jooyoung CHOI
Author Information

1. Ewha Womans University College of Medicine, Seoul, Korea
Publication Type:Original article
From:The Ewha Medical Journal 2025;48(2):e33-
Country:Republic of Korea
Language:English
Abstract: Purpose:This study compares 3 deep learning models (UNet, TransUNet, and MIST) for left atrium (LA) segmentation in cardiac computed tomography (CT) images from patients with congenital heart disease (CHD). It also investigates how architectural variations in the MIST model, such as spatial squeeze-and-excitation attention, affect the Dice score and the 95th percentile Hausdorff distance (HD95).
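For readers unfamiliar with spatial squeeze-and-excitation attention, the following is a minimal NumPy sketch of the general idea: the channel axis is squeezed to a single map by a per-voxel weighted sum (equivalent to a 1x1x1 convolution), passed through a sigmoid, and used to rescale every channel. The abstract does not describe how MIST implements this block, so the function name `spatial_se`, the `(C, D, H, W)` layout, and the `weights` and `bias` parameters are illustrative assumptions.

```python
import numpy as np


def spatial_se(features, weights, bias=0.0):
    """Spatial squeeze-and-excitation on a (C, D, H, W) feature map.

    `weights` has shape (C,) and plays the role of a 1x1x1 convolution
    that maps C channels to one. All names here are hypothetical.
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squeeze: weighted sum over the channel axis at every voxel -> (D, H, W)
    logits = np.tensordot(weights, features, axes=([0], [0])) + bias
    # Excite: sigmoid turns the logits into a spatial gate in (0, 1)
    gate = 1.0 / (1.0 + np.exp(-logits))
    # Rescale every channel by the same spatial gate (broadcast over C)
    return features * gate
```

In a trained network the weights would be learned; here they are passed in explicitly so the arithmetic is easy to follow.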
Methods:We analyzed 108 publicly available, de-identified CT volumes from the ImageCHD dataset. Volumes underwent resampling, intensity normalization, and data augmentation. Of these, 97 cases were used to develop the UNet, TransUNet, and MIST models (80% for training and 20% for validation), and the remaining 11 cases were reserved for testing. Performance was evaluated using the Dice score (measuring overlap accuracy) and HD95 (reflecting boundary accuracy). Statistical comparisons were performed using one-way repeated measures analysis of variance.
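As a rough illustration of the two metrics, the sketch below computes the Dice score and HD95 for a pair of binary masks using NumPy only. It is not the authors' evaluation code. The function names, the default voxel spacing, the convention for empty masks, and the choice of taking the larger of the two directed 95th percentile surface distances (one of several HD95 variants in use) are all assumptions.

```python
import numpy as np


def dice_score(pred, target):
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def surface_points(mask, spacing):
    """Coordinates (in mm) of foreground voxels that touch background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core_slice = (slice(1, -1),) * mask.ndim
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel is interior only if every face neighbour is foreground
            interior &= np.roll(padded, shift, axis=axis)[core_slice]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(pred, target, spacing=(1.0, 1.0, 1.0)):
    """95th percentile Hausdorff distance between two mask surfaces."""
    a = surface_points(pred, spacing)
    b = surface_points(target, spacing)
    if len(a) == 0 or len(b) == 0:
        return float("nan")  # assumption: undefined when a mask is empty
    # Brute-force pairwise distances: fine for a sketch, but too
    # memory-hungry for full-resolution CT volumes
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = np.percentile(dist.min(axis=1), 95)
    b_to_a = np.percentile(dist.min(axis=0), 95)
    return float(max(a_to_b, b_to_a))
```

Passing the true voxel spacing matters here: the Dice score is unitless, whereas HD95 is reported in millimetres and therefore depends on the physical voxel size after resampling.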
Results:MIST achieved the highest mean Dice score (0.74; 95% confidence interval, 0.67–0.81), significantly outperforming TransUNet (0.53; P<0.001) and UNet (0.49; P<0.001). Regarding HD95, TransUNet (9.09 mm) and MIST (5.77 mm) both outperformed UNet (27.49 mm; P<0.0001). In ablation experiments, the inclusion of spatial attention did not further enhance the MIST model’s performance, suggesting redundancy with existing attention mechanisms. However, the integration of multi-scale features and refined skip connections consistently improved segmentation accuracy and boundary delineation.
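The P values above come from a one-way repeated measures analysis of variance, in which each test case is scored under every model. A minimal sketch of the F statistic for that design is shown below, assuming a complete table with one row per case and one column per model. The function name `rm_anova_f` is hypothetical, and the sketch omits sphericity corrections and post hoc comparisons, which the abstract does not detail.

```python
import numpy as np


def rm_anova_f(scores):
    """One-way repeated measures ANOVA F statistic.

    `scores` has shape (n_cases, k_models): each row is one test case
    measured under every model. Returns (F, df_models, df_error).
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    # Variation explained by the models (columns)
    ss_models = n * np.sum((scores.mean(axis=0) - grand_mean) ** 2)
    # Variation explained by differences between cases (rows)
    ss_cases = k * np.sum((scores.mean(axis=1) - grand_mean) ** 2)
    ss_total = np.sum((scores - grand_mean) ** 2)
    # Residual variation after removing model and case effects
    ss_error = ss_total - ss_models - ss_cases
    df_models = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_models / df_models) / (ss_error / df_error)
    return float(f_stat), df_models, df_error
```

A P value can then be read from the upper tail of the F distribution with `(df_models, df_error)` degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df_models, df_error)` if SciPy is available.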
Conclusion:MIST demonstrated superior LA segmentation, highlighting the benefits of its integrated multi-scale features and optimized architecture. Nevertheless, its computational overhead complicates practical clinical deployment. Our findings underscore the value of advanced hybrid models in cardiac imaging, providing improved reliability for CHD evaluation. Future studies should balance segmentation accuracy with feasible clinical implementation.