Cell nucleus segmentation in pathological images based on text annotations and Transformer
10.3969/j.issn.1005-202X.2025.10.009
- VernacularTitle:基于文本注释和Transformer的病理图像细胞核分割
- Author:
Jinling CHEN¹; Yu CHEN; Zhuowei TANG; Jihong WEI; Qi KE; Yuzhu JI; Ziqing GAO
Author Information
1. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500, Sichuan, China
- Publication Type:Journal Article
- Keywords:
pathological image;
cell nucleus segmentation;
text annotation;
feature fusion
- From:
Chinese Journal of Medical Physics
2025;42(10):1328-1336
- Country: China
- Language:Chinese
- Abstract:
A VLi-net based cell nucleus segmentation method integrating convolutional neural networks (CNN) and Vision Transformer (ViT) is proposed to address the limitation that U-Net, with a CNN backbone, is proficient only in capturing local features and has a restricted receptive field. First, to mitigate challenges such as the high cost of data annotation and the shortage of annotated data, text annotations are introduced to enhance the network's understanding of image information. Second, to improve the segmentation performance of VLi-net, ViT and CNN are combined to fully extract global and local features, with multi-receptive-field convolution features incorporated into the ViT structure to effectively mitigate the limited local information interaction and single feature representation of ViT. Finally, an interactive fusion module (ViFusion) is used to efficiently fuse the multi-level features from the CNN and ViT branches. Experimental results show that VLi-net achieves a Dice coefficient of 80.85% and a mean intersection over union (MIoU) of 66.83% on the MoNuSeg dataset, a Dice coefficient of 80.53% and an MIoU of 67.54% on the DSB-2018 dataset, and a Dice coefficient of 86.87% and an MIoU of 77.44% on the TNBC dataset. These findings confirm that VLi-net outperforms other methods across multiple experimental metrics.
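The core idea of the ViFusion module described above — interactively blending local CNN-branch features with global ViT-branch features — can be sketched as a gated, element-wise fusion. The snippet below is an illustrative toy sketch only, not the authors' implementation: the function name `vifusion`, the scalar `gate_weight`, and the sigmoid-gated convex blend are all assumptions made for exposition; the real VLi-net fusion operates on multi-level feature maps and uses learned parameters.

```python
# Illustrative sketch (not the paper's code): fusing a "local" CNN feature
# vector and a "global" ViT feature vector with a sigmoid gate. The gate is
# driven by the interaction of the two branches, so each output element is a
# convex blend lying between the corresponding local and global activations.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def vifusion(cnn_feat, vit_feat, gate_weight=0.5):
    """Element-wise interactive fusion: a sigmoid gate decides, per element,
    how much of the global (ViT) feature to mix into the local (CNN) one."""
    fused = []
    for c, v in zip(cnn_feat, vit_feat):
        g = sigmoid(gate_weight * (c + v))  # interaction term drives the gate
        fused.append(g * v + (1 - g) * c)   # convex blend of the two branches
    return fused

local_features  = [0.2, -1.0, 3.5]   # toy CNN-branch activations
global_features = [1.0,  0.5, -0.5]  # toy ViT-branch activations
print(vifusion(local_features, global_features))
```

Because the blend is convex, every fused value stays between the two branch activations, which is one simple way a fusion module can trade off local detail against global context.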