1.Research Progress and Prospects of Minimally Invasive Surgical Instrument Segmentation Methods Based on Artificial Intelligence.
Weimin CHENG ; Xiaohua WU ; Jing XIONG
Chinese Journal of Medical Instrumentation 2025;49(1):15-23
With the development of artificial intelligence technology and the growing demand for minimally invasive surgery, making minimally invasive surgery more intelligent has become an active research topic. Surgical instrument segmentation is a highly promising technology that can enhance the performance of minimally invasive endoscopic imaging systems, surgical video analysis systems, and other related systems. This article summarizes deep learning-based semantic and instance segmentation methods for minimally invasive surgical instruments, analyzes in depth the supervision methods used for training, network structure improvements, and attention mechanisms, and then discusses methods based on the Segment Anything Model. Because deep learning methods place extremely high demands on data, current data augmentation methods are also explored. Finally, a summary and outlook on instrument segmentation technology are provided.
Artificial Intelligence
;
Minimally Invasive Surgical Procedures/instrumentation*
;
Algorithms
;
Deep Learning
;
Humans
;
Image Processing, Computer-Assisted
2.Three-Dimensional Reconstruction Technique and Its Application of Binocular Endoscopic Images Based on Deep Learning.
Lina HUANG ; Shenglin LIU ; Qingmin FENG ; Haolong JIN ; Qiang ZHANG
Chinese Journal of Medical Instrumentation 2025;49(2):161-168
The clinical application of binocular endoscopes relies primarily on the physician's visual system to create a three-dimensional effect, which cannot provide accurate depth information. Applying 3D reconstruction technology to binocular endoscopy can recover image depth information, and deep learning-based 3D reconstruction can significantly improve the accuracy and real-time performance of the reconstruction results, making it widely applicable in minimally invasive surgery. This paper explores the key technologies and implementation methods of deep learning-based 3D reconstruction for binocular endoscopic images and outlines strategies for enhancing the quality of 3D reconstruction in endoscopic images, providing guidance for the sustainable development of binocular endoscopic image reconstruction technology in clinical settings. This will assist in the application of minimally invasive surgery and contribute to meeting the demands of precision medicine.
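As background for the depth recovery mentioned above, rectified stereo geometry relates depth to disparity by Z = f * B / d. The sketch below assumes an ideal rectified binocular rig; the function name and the example values (focal length in pixels, baseline in millimetres) are hypothetical and are not taken from the article.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth (mm) of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_mm / disparity_px


# Hypothetical endoscope values, for illustration only.
z_mm = depth_from_disparity(disparity_px=40.0, focal_px=800.0, baseline_mm=4.0)
print(z_mm)  # 80.0
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy at larger distances, which is one reason the quality of the disparity estimate matters.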
Deep Learning
;
Imaging, Three-Dimensional/methods*
;
Humans
;
Endoscopy/methods*
;
Image Processing, Computer-Assisted/methods*
;
Minimally Invasive Surgical Procedures
3.Review on Applications of Deep Learning in Digital Pathological Images.
Chaoyi LYU ; Yuan XIE ; Lu QIU ; Lu ZHAO ; Jun ZHAO
Chinese Journal of Medical Instrumentation 2025;49(3):237-243
Computer-assisted methods for pathological image analysis can improve pathologists' image-reading efficiency and diagnostic accuracy, helping to address the shortage of pathology diagnostic manpower. With the rapid development of artificial intelligence and digital pathology, deep learning technology has spurred a wealth of research in the field of histopathology. This article reviews the various applications of deep learning in digital pathological image analysis, such as pathological image segmentation, cancer auxiliary diagnosis, and cancer prognosis prediction, and discusses the challenges and solutions in its application. Furthermore, it predicts future trends in deep learning for pathological image analysis and proposes potential research directions.
Deep Learning
;
Humans
;
Image Processing, Computer-Assisted/methods*
;
Artificial Intelligence
;
Neoplasms
4.Seeing the macro in the micro: a diffusion model-based approach for style transfer in cellular images.
Jiayi CAI ; Yong HE ; Feng LIU ; Byung-Ho KANG ; Xuping FENG
Journal of Zhejiang University. Science. B 2025;26(6):609-612
The internal structures of cells as the basic units of life are a major wonder of the microscopic world. Cellular images provide an intriguing window to help explore and understand the composition and function of these structures. Scientific imagery combined with artistic expression can further expand the potential of imaging in educational dissemination and interdisciplinary applications. This study presents an innovative diffusion model-based approach for style transfer in cellular images, combining scientific rigor with artistic expression. By leveraging training-free large-scale pre-trained diffusion models, the proposed method integrates the intricate morphological and textural features of cellular images with diverse artistic styles. Key techniques such as the inversion of denoising diffusion implicit models (DDIMs), adaptive instance normalization (AdaIN), self-attention style injection, and attention temperature scaling ensure the preservation of cellular structures while enhancing visual expressiveness. The results showcase the potential of this strategy for interdisciplinary applications, enriching both the visualization and educational dissemination of cellular imagery through compelling storytelling and aesthetic appeal.
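Two of the techniques named above have compact standard forms. The sketch below shows adaptive instance normalization on a single flattened feature channel and a temperature-scaled softmax over attention scores. It is a minimal illustration, not the authors' implementation: real AdaIN is applied per channel over spatial positions, and dividing the scores by the temperature is one common convention assumed here.

```python
import math
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Shift and scale content features to match the style's mean and std."""
    mu_c, sigma_c = fmean(content), pstdev(content)
    mu_s, sigma_s = fmean(style), pstdev(style)
    return [sigma_s * (x - mu_c) / (sigma_c + eps) + mu_s for x in content]


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over attention scores; a temperature below 1 sharpens the weights."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical feature values, for illustration only.
stylized = adain([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
weights = softmax_with_temperature([1.0, 2.0, 3.0], temperature=0.5)
```

After `adain`, the output keeps the relative ordering of the content features while taking on the style statistics, which is the sense in which structure is preserved and appearance is transferred.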
Humans
;
Image Processing, Computer-Assisted/methods*
;
Cells
;
Diffusion
5.A low-dose CT reconstruction method using sub-pixel anisotropic diffusion.
Shizhou TANG ; Ruolan SU ; Shuting LI ; Zhenzhen LAI ; Jinhong HUANG ; Shanzhou NIU
Journal of Southern Medical University 2025;45(1):162-169
OBJECTIVES:
We present a new low-dose CT reconstruction method using sub-pixel and anisotropic diffusion.
METHODS:
The sub-pixel intensity values and their second-order differences were obtained using linear interpolation techniques, and the new gradient information was then embedded into an anisotropic diffusion process, which was introduced into a penalty-weighted least squares model to reduce the noise in low-dose CT projection data. The high-quality CT image was finally reconstructed using the classical filtered back-projection (FBP) algorithm from the estimated data.
RESULTS:
In the Shepp-Logan phantom experiments, the structural similarity (SSIM) index of the CT image reconstructed by the proposed algorithm, as compared with FBP, PWLS-Gibbs and PWLS-TV algorithms, was increased by 28.13%, 5.49%, and 0.91%, the feature similarity (FSIM) index was increased by 21.08%, 1.78%, and 1.36%, and the root mean square error (RMSE) was reduced by 69.59%, 18.96%, and 3.90%, respectively. In the digital XCAT phantom experiments, the SSIM index of the CT image reconstructed by the proposed algorithm, as compared with FBP, PWLS-Gibbs and PWLS-TV algorithms, was increased by 14.24%, 1.43% and 7.89%, the FSIM index was increased by 9.61%, 1.78% and 5.66%, and the RMSE was reduced by 26.88%, 9.41% and 18.39%, respectively. In clinical experiments, the SSIM index of the image reconstructed using the proposed algorithm was increased by 19.24%, 15.63% and 3.68%, the FSIM index was increased by 4.30%, 2.92% and 0.43%, and the RMSE was reduced by 44.60%, 36.84% and 15.22% in comparison with FBP, PWLS-Gibbs and PWLS-TV algorithms, respectively.
CONCLUSIONS:
The proposed method can effectively reduce noise and artifacts while maintaining the structural details in low-dose CT images.
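For readers unfamiliar with anisotropic diffusion, the sketch below shows one explicit update of the classical Perona-Malik scheme on an image grid. It is only a reference point: it is not the sub-pixel variant proposed in the paper, and it operates on an image rather than inside a penalty-weighted least squares model on projection data. The parameter names `kappa` and `dt` and their default values are assumptions.

```python
import math


def perona_malik_step(img, kappa=10.0, dt=0.2):
    """One explicit Perona-Malik update on a 2D list of floats."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(rows):
        for j in range(cols):
            flux = 0.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni = min(max(i + di, 0), rows - 1)  # replicate border pixels
                nj = min(max(j + dj, 0), cols - 1)
                grad = img[ni][nj] - img[i][j]
                # Edge-stopping weight: near 1 in flat areas, near 0 across edges.
                flux += math.exp(-((grad / kappa) ** 2)) * grad
            out[i][j] = img[i][j] + dt * flux
    return out


# A small isolated spike is smoothed, while a strong edge is left almost untouched.
spike = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
edge = [[0.0, 0.0, 100.0, 100.0] for _ in range(3)]
smoothed_spike = perona_malik_step(spike)
preserved_edge = perona_malik_step(edge)
```

The edge-stopping weight is what makes the diffusion anisotropic in effect: differences much smaller than `kappa` are averaged away as noise, whereas differences much larger than `kappa` are treated as structure and kept.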
Tomography, X-Ray Computed/methods*
;
Algorithms
;
Phantoms, Imaging
;
Anisotropy
;
Image Processing, Computer-Assisted/methods*
;
Humans
;
Radiation Dosage
6.A sparse-view cone-beam CT reconstruction algorithm based on bidirectional flow field-guided projection completion.
Wenwei LI ; Zerui MAO ; Yongbo WANG ; Zhaoying BIAN ; Jing HUANG
Journal of Southern Medical University 2025;45(2):395-408
OBJECTIVES:
We propose a sparse-view cone-beam CT reconstruction algorithm based on bidirectional flow field-guided projection completion (BBC-Recon) to solve the ill-posed inverse problem in sparse-view cone-beam CT imaging.
METHODS:
The BBC-Recon method consists of two main modules: the projection completion module and the image restoration module. Based on flow field estimation, the projection completion module, through the designed bidirectional and multi-scale correlators, fully calculates the correlation information and redundant information among projections to precisely guide the generation of bidirectional flow fields and missing frames, thus achieving high-precision completion of missing projections and obtaining pseudo complete projections. The image restoration module reconstructs the obtained pseudo complete projections and then refines the image to remove the residual artifacts and further improve the image quality.
RESULTS:
The experimental results on the public datasets of Mayo Clinic and Guilin Medical University showed that in the case of a 4-fold sparse angle, compared with the second-best method, the BBC-Recon method increased the PSNR index by 1.80% and the SSIM index by 0.29%, and reduced the RMSE index by 4.12%. In the case of an 8-fold sparse angle, the BBC-Recon method increased the PSNR index by 1.43% and the SSIM index by 1.49%, and reduced the RMSE index by 0.77%.
CONCLUSIONS:
The BBC-Recon algorithm fully exploits the correlation information between projections to allow effective removal of streak artifacts while preserving image structure information, and demonstrates significant advantages in maintaining inter-slice consistency.
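The RMSE and PSNR figures quoted above follow standard definitions, sketched below for flattened pixel lists. The abstract does not state how the percentage changes were computed; the relative-change helper assumes they are taken relative to the comparison method, and all names and sample values are hypothetical.

```python
import math


def rmse(ref, img):
    """Root mean square error between two equally sized pixel lists."""
    return math.sqrt(sum((r - x) ** 2 for r, x in zip(ref, img)) / len(ref))


def psnr(ref, img, data_range):
    """Peak signal-to-noise ratio in dB for a given intensity range."""
    err = rmse(ref, img)
    if err == 0:
        return math.inf
    return 20.0 * math.log10(data_range / err)


def relative_change_percent(baseline, value):
    """Change of a metric relative to a baseline method, in percent (assumed convention)."""
    return 100.0 * (value - baseline) / baseline


reference = [0.0, 0.0, 0.0, 0.0]
reconstruction = [1.0, 1.0, 1.0, 1.0]
print(rmse(reference, reconstruction))                    # 1.0
print(psnr(reference, reconstruction, data_range=10.0))   # 20.0
```

Note that PSNR depends on the chosen `data_range`, so PSNR values are only comparable between methods evaluated with the same intensity normalization.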
Algorithms
;
Cone-Beam Computed Tomography/methods*
;
Image Processing, Computer-Assisted/methods*
;
Humans
7.A segmented backprojection tensor degradation feature encoding model for motion artifacts correction in dental cone beam computed tomography.
Zhixiong ZENG ; Yongbo WANG ; Zongyue LIN ; Zhaoying BIAN ; Jianhua MA
Journal of Southern Medical University 2025;45(2):422-436
OBJECTIVES:
We propose a segmented backprojection tensor degradation feature encoding (SBP-MAC) model for motion artifact correction in dental cone beam computed tomography (CBCT) to improve the quality of the reconstructed images.
METHODS:
The proposed motion artifact correction model consists of a generator and a degradation encoder. The segmented limited-angle reconstructed sub-images are stacked into tensors and used as the model input. The degradation encoder extracts spatially varying motion information from the tensor, and the generator's skip connection features are adaptively modulated to guide the model in correcting artifacts caused by different motion waveforms. An artifact consistency loss function is designed to simplify the learning task of the generator.
RESULTS:
The proposed model could effectively remove motion artifacts and improve the quality of the reconstructed images. For simulated data, the proposed model increased the peak signal-to-noise ratio by 8.28%, increased the structural similarity index measure by 2.29%, and decreased the root mean square error by 23.84%. For real clinical data, the proposed model achieved the highest expert score of 4.4221 (on a 5-point scale), which was significantly higher than those of all the other comparison methods.
CONCLUSIONS:
The SBP-MAC model can effectively extract spatially varying motion information in the tensors and achieve adaptive artifact correction from the tensor domain to the image domain to improve the quality of reconstructed dental CBCT images.
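The "segmented" input described in the methods can be pictured as splitting the projection views into groups, back-projecting each group into a limited-angle sub-image, and stacking the sub-images. The sketch below covers only the first step. The abstract does not say how segments are formed, so contiguous, near-equal segments are an assumption, and the function name and view counts are hypothetical.

```python
def split_into_segments(n_views, n_segments):
    """Partition view indices 0..n_views-1 into contiguous, near-equal segments."""
    if not 1 <= n_segments <= n_views:
        raise ValueError("need 1 <= n_segments <= n_views")
    base, extra = divmod(n_views, n_segments)
    segments, start = [], 0
    for k in range(n_segments):
        size = base + (1 if k < extra else 0)  # spread the remainder over the first segments
        segments.append(range(start, start + size))
        start += size
    return segments


# Hypothetical scan with 10 views split into 3 limited-angle segments.
for segment in split_into_segments(10, 3):
    print(list(segment))
# [0, 1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
```

Each segment spans a short time window of the scan, so a sub-image reconstructed from it is affected by only part of the motion, which is presumably what lets an encoder read time-varying motion information from the stacked tensor.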
Cone-Beam Computed Tomography/methods*
;
Artifacts
;
Humans
;
Motion
;
Image Processing, Computer-Assisted/methods*
;
Signal-To-Noise Ratio
;
Algorithms
8.A multi-scale supervision and residual feedback optimization algorithm for improving optic chiasm and optic nerve segmentation accuracy in nasopharyngeal carcinoma CT images.
Jinyu LIU ; Shujun LIANG ; Yu ZHANG
Journal of Southern Medical University 2025;45(3):632-642
OBJECTIVES:
We propose a novel deep learning segmentation algorithm (DSRF) based on multi-scale supervision and residual feedback strategy for precise segmentation of the optic chiasm and optic nerves in CT images of nasopharyngeal carcinoma (NPC) patients.
METHODS:
We collected 212 NPC CT images and their ground truth labels from SegRap2023, StructSeg2019 and HaN-Seg2023 datasets. Based on a hybrid pooling strategy, we designed a decoder (HPS) to reduce small organ feature loss during pooling in convolutional neural networks. This decoder uses adaptive and average pooling to refine high-level semantic features, which are integrated with primary semantic features to enable network learning of finer feature details. We employed multi-scale deep supervision layers to learn rich multi-scale and multi-level semantic features under deep supervision, thereby enhancing boundary identification of the optic chiasm and optic nerves. A residual feedback module that enables multiple iterations of the network was designed for contrast enhancement of the optic chiasm and optic nerves in CT images by utilizing information from fuzzy boundaries and easily confused regions to iteratively refine segmentation results under supervision. The entire segmentation framework was optimized with the loss from each iteration to enhance segmentation accuracy and boundary clarity. Ablation experiments and comparative experiments were conducted to evaluate the effectiveness of each component and the performance of the proposed model.
RESULTS:
The DSRF algorithm could effectively enhance feature representation of small organs to achieve accurate segmentation of the optic chiasm and optic nerves with an average DSC of 0.837 and an ASSD of 0.351. Ablation experiments further verified the contributions of each component in the DSRF method.
CONCLUSIONS:
The proposed deep learning segmentation algorithm can effectively enhance feature representation to achieve accurate segmentation of the optic chiasm and optic nerves in CT images of NPC.
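The two metrics reported above, DSC and ASSD, can be sketched for 2D binary masks as follows. This is a simplified illustration under stated assumptions: the study works on CT volumes, the surface definition used here (foreground pixels with a 4-connected background or border neighbor) is one common choice, and the `spacing` parameter stands in for the physical pixel size, which the abstract does not give.

```python
import math


def dice(a, b):
    """Dice similarity coefficient of two binary masks (2D lists of 0/1)."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 2.0 * inter / total if total else 1.0


def surface(mask):
    """Foreground pixels that touch the background or the image border."""
    rows, cols = len(mask), len(mask[0])
    pts = []
    for i in range(rows):
        for j in range(cols):
            if not mask[i][j]:
                continue
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if not (0 <= ni < rows and 0 <= nj < cols) or not mask[ni][nj]:
                    pts.append((i, j))
                    break
    return pts


def assd(a, b, spacing=(1.0, 1.0)):
    """Average symmetric surface distance between two non-empty masks."""
    sa, sb = surface(a), surface(b)
    if not sa or not sb:
        raise ValueError("both masks must contain foreground")

    def dist(p, q):
        return math.hypot((p[0] - q[0]) * spacing[0], (p[1] - q[1]) * spacing[1])

    d_ab = [min(dist(p, q) for q in sb) for p in sa]
    d_ba = [min(dist(q, p) for p in sa) for q in sb]
    return (sum(d_ab) + sum(d_ba)) / (len(sa) + len(sb))


pred = [[1, 1, 0, 0], [1, 1, 0, 0]]
truth = [[0, 1, 1, 0], [0, 1, 1, 0]]
print(dice(pred, truth))  # 0.5
print(assd(pred, truth))  # 0.5
```

DSC rewards overlap and is harsh on small structures, where a one-pixel shift removes a large share of the overlap, while ASSD measures how far the predicted boundary lies from the true one. Reporting both is therefore informative for thin organs such as the optic nerves.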
Humans
;
Tomography, X-Ray Computed/methods*
;
Optic Chiasm/diagnostic imaging*
;
Optic Nerve/diagnostic imaging*
;
Algorithms
;
Nasopharyngeal Carcinoma
;
Deep Learning
;
Nasopharyngeal Neoplasms/diagnostic imaging*
;
Neural Networks, Computer
;
Image Processing, Computer-Assisted/methods*
9.A low-dose CT image restoration method based on central guidance and alternating optimization.
Xiaoyu ZHANG ; Hao WANG ; Dong ZENG ; Zhaoying BIAN
Journal of Southern Medical University 2025;45(4):844-852
OBJECTIVES:
We propose a low-dose CT image restoration method based on central guidance and alternating optimization (FedGP).
METHODS:
The FedGP framework departs from the traditional federated learning model by adopting a structure without a fixed central server, in which each institution alternately serves as the central server. This method uses an institution-modulated CT image restoration network as the core of client-side local training. Through a federated learning approach of central guidance and alternating optimization, the central server leverages local labeled data to guide client-side network training to enhance the generalization capability of the CT imaging model across multiple institutions.
RESULTS:
In the low-dose and sparse-view CT image restoration tasks, the FedGP method showed significant advantages in both visual and quantitative evaluation and achieved the highest PSNR (40.25 and 38.84), the highest SSIM (0.95 and 0.92), and the lowest RMSE (2.39 and 2.56). An ablation study showed that, compared with the variant without central guidance, FedGP (w/o GP), the full FedGP method adapted better to data heterogeneity across institutions, thus ensuring robustness and generalization capability of the model under different imaging conditions.
CONCLUSIONS:
FedGP provides a more flexible federated learning (FL) framework to address the problem of CT imaging heterogeneity and adapts well to multi-institutional data characteristics, improving the generalization ability of the model under diverse imaging geometric configurations.
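The idea of institutions taking turns as the central server can be illustrated with a toy federated loop. The sketch below is not FedGP: it uses a scalar linear model, plain averaging of client updates, a round-robin choice of server, and a server-side refinement step on the server's own labeled data, all of which are assumptions made for illustration. The function names and datasets are hypothetical.

```python
def local_step(w, data, lr=0.1):
    """One gradient step of least squares for the scalar model y = w * x."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def run_rounds(datasets, n_rounds, w0=0.0):
    """Federated rounds in which institutions take turns acting as the server."""
    n = len(datasets)
    w_global, servers = w0, []
    for r in range(n_rounds):
        server = r % n  # round-robin choice of the acting server (assumption)
        servers.append(server)
        # Every other institution trains locally from the current global weight.
        updates = [
            local_step(w_global, data)
            for i, data in enumerate(datasets)
            if i != server
        ]
        w_avg = sum(updates) / len(updates)
        # The acting server refines the average on its own labeled data.
        w_global = local_step(w_avg, datasets[server])
    return w_global, servers


# Three hypothetical institutions whose data follow y = 2x over different ranges.
datasets = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0)],
    [(1.0, 2.0), (3.0, 6.0)],
]
w, servers = run_rounds(datasets, n_rounds=30)
print(servers[:4])  # [0, 1, 2, 0]
```

Rotating the server role means no single institution's labeled data dominates the guidance step, which is one plausible reading of why an alternating scheme would help with heterogeneity across sites.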
Tomography, X-Ray Computed/methods*
;
Humans
;
Radiation Dosage
;
Image Processing, Computer-Assisted/methods*
;
Algorithms
10.AConvLSTM U-Net: a multi-scale jaw cyst segmentation model based on bidirectional dense connection and attention mechanism.
Suqiang LI ; Zhouyang WANG ; Sixian CHAN ; Xiaolong ZHOU
Journal of Southern Medical University 2025;45(5):1082-1092
OBJECTIVES:
We propose a multi-scale jaw cyst segmentation model, AConvLSTM U-Net, which is based on bidirectional dense connections and attention mechanisms to achieve accurate automatic segmentation of mandibular cyst images.
METHODS:
A dataset consisting of 2592 jaw cyst images was used. AConvLSTM U-Net places an MBC on the encoding path to enhance feature extraction. A DPD connects the encoder and decoder, and a bidirectional ConvLSTM is introduced in the skip connection to obtain rich semantic information. A decoding block based on scSE is then used on the decoding path to strengthen the focus on important information. Finally, a DS was designed, and the model was optimized with a joint loss function to further improve the segmentation accuracy.
RESULTS:
The experiment with AConvLSTM U-Net for jaw cyst lesion segmentation showed an MCC of 93.8443%, a DSC of 93.9067%, and a JSC of 88.5133%, outperforming all the other comparison segmentation models.
CONCLUSIONS:
The proposed algorithm shows a high accuracy and robustness on the jaw cyst dataset, demonstrating its superior performance over many existing methods for automatic segmentation of jaw cyst images and its potential to assist clinical diagnosis.
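The three reported metrics can all be derived from pixel-level confusion counts, as sketched below. The definitions are standard, but the function names and the example counts are hypothetical, and the abstract does not say whether the scores were computed per image and averaged or pooled over the dataset.

```python
import math


def segmentation_scores(tp, fp, fn, tn):
    """MCC, Dice (DSC) and Jaccard (JSC) from pixel-level confusion counts."""
    dsc = 2 * tp / (2 * tp + fp + fn)
    jsc = tp / (tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"MCC": mcc, "DSC": dsc, "JSC": jsc}


def jaccard_from_dice(dsc):
    """For a single pair of masks, JSC = DSC / (2 - DSC)."""
    return dsc / (2.0 - dsc)


# Hypothetical counts for one image.
scores = segmentation_scores(tp=8, fp=2, fn=2, tn=88)
print(scores["DSC"])  # 0.8
```

Applying `jaccard_from_dice` to the reported DSC of 0.939067 gives about 0.8851, in line with the reported JSC of 88.5133%. Unlike DSC and JSC, MCC also accounts for true negatives, so it is less easily inflated when the background dominates the image.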
Humans
;
Jaw Cysts/diagnostic imaging*
;
Algorithms
;
Image Processing, Computer-Assisted/methods*
;
Neural Networks, Computer
