- VernacularTitle:基于类相关图卷积网络的单细胞蛋白质定位方法 [Single-cell protein localization method based on a class-correlated graph convolutional network]
- Author: Hao-Yang TANG 1; Xin-Yue YAO 1; Meng-Meng WANG 1; Si-Cong YANG 1
- Publication Type:Journal Article
- Keywords: single-cell protein subcellular localization; weak supervision; graph convolutional network; class perception; pseudo label
- From: Progress in Biochemistry and Biophysics 2025;52(9):2417-2427
- Country:China
- Language:Chinese
- Abstract: Objective: This study proposes a single-cell protein localization method based on a class perception graph convolutional network (CP-GCN) to address three challenges in protein microscopy image analysis: the scarcity of cell-level annotations, inadequate feature extraction, and the difficulty of localizing proteins precisely within individual cells. Methods: The method combines several components designed to improve both feature extraction and localization accuracy. First, a class perception module (CPM) captures and distinguishes semantic features across subcellular categories, yielding more discriminative feature representations. Building on this module, the CP-GCN network explores global features of subcellular proteins in multicellular environments. The network incorporates a category feature-aware module that extracts protein semantic features aligned with the label dimensions, and a subcellular relationship mining module that models correlations between subcellular structures. Together, these modules generate co-occurrence embedding features that encode spatial and contextual relationships among subcellular locations, improving feature representation. To refine localization, the K-means clustering algorithm is applied to the multi-scale features within each subcellular category to generate multi-cell class activation maps (CAMs), which highlight the discriminative regions associated with specific subcellular locations. To compensate for the lack of annotated single-cell data, a pseudo-label generation strategy segments multicellular images into single-cell images and assigns reliable pseudo-labels based on the CAM-predicted regions, providing training data for single-cell analysis. Under a transfer learning framework, the model is then trained on the extracted features and pseudo-labels to localize proteins at the single-cell level. Results: Experiments on multiple single-cell test datasets show that the proposed method significantly outperforms existing approaches in robustness and localization accuracy. On the Kaggle 2021 dataset, the method achieves superior mean average precision (mAP) across 18 subcellular categories. Visualization of the generated CAMs further confirms that the model accurately localizes subcellular proteins within individual cells, even in complex multicellular environments. Conclusion: Combining the CP-GCN network with a pseudo-labeling strategy enables the method to capture heterogeneous cellular features in protein images and to localize proteins precisely at the single-cell level. The method addresses key limitations of current protein image analysis and offers a scalable, accurate solution for subcellular protein studies, with potential applications in biomedical research and diagnostic imaging. The results underscore the value of pairing deep learning architectures with training strategies that overcome data scarcity in biological image analysis. Future work could extend the framework to other types of microscopic imaging and to large-scale protein interaction studies.
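The pseudo-label step described in the abstract (splitting a multicellular image into single cells and labelling each cell from the CAM-predicted regions) can be illustrated with a minimal sketch. The abstract does not give the assignment rule, so everything here is an assumption for illustration: the function name `assign_pseudo_labels`, the use of the mean CAM activation inside each cell, and the `threshold` value are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def assign_pseudo_labels(cams, cell_mask, image_labels, threshold=0.5):
    """Assign image-level classes to single cells from CAM activations.

    cams: dict mapping class index -> 2D list of activations scaled to [0, 1]
    cell_mask: 2D list of ints, 0 = background, k > 0 = pixels of cell k
    image_labels: class indices annotated for the whole multicellular image
    threshold: hypothetical cut-off on the mean in-cell activation (assumption)
    """
    # Collect the pixel coordinates that belong to each segmented cell.
    cell_pixels = defaultdict(list)
    for r, row in enumerate(cell_mask):
        for c, cell_id in enumerate(row):
            if cell_id > 0:
                cell_pixels[cell_id].append((r, c))

    pseudo_labels = {}
    for cell_id, pixels in sorted(cell_pixels.items()):
        labels = []
        # Only classes present in the image-level annotation are candidates.
        for cls in sorted(image_labels):
            cam = cams[cls]
            mean_act = sum(cam[r][c] for r, c in pixels) / len(pixels)
            if mean_act >= threshold:
                labels.append(cls)
        pseudo_labels[cell_id] = labels
    return pseudo_labels


# Toy 2x5 image with two cells (ids 1 and 2) and two image-level classes (0 and 3).
cell_mask = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
]
cams = {
    0: [[0.9, 0.8, 0.1, 0.2, 0.1],
        [0.7, 0.9, 0.0, 0.1, 0.2]],
    3: [[0.1, 0.2, 0.0, 0.9, 0.8],
        [0.6, 0.6, 0.1, 0.7, 0.9]],
}
result = assign_pseudo_labels(cams, cell_mask, image_labels=[0, 3])
print(result)  # {1: [0], 2: [3]}
```

In this toy case each cell inherits only the image-level class whose CAM is strongly activated inside it, which is the general idea of turning image-level labels into cell-level ones; the paper's actual criterion may differ.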