1. A machine learning approach for the diagnosis of obstructive sleep apnoea using oximetry, demographic and anthropometric data.
Zhou Hao LEONG ; Shaun Ray Han LOH ; Leong Chai LEOW ; Thun How ONG ; Song Tar TOH
Singapore medical journal 2025;66(4):195-201
INTRODUCTION:
Obstructive sleep apnoea (OSA) is a serious but underdiagnosed condition. Demand for the gold standard diagnostic polysomnogram (PSG) far exceeds its availability. More efficient diagnostic methods are needed, even in tertiary settings. Machine learning (ML) models have strengths in disease prediction and early diagnosis. We explored the use of ML with oximetry, demographic and anthropometric data to diagnose OSA.
METHODS:
A total of 2,996 patients were included for modelling and divided into test and training sets. Seven commonly used supervised learning algorithms were trained with the data. Sensitivity (recall), specificity, positive predictive value (PPV) (precision), negative predictive value, area under the receiver operating characteristic curve (AUC) and F1 measure were reported for each model.
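The metrics listed above all derive from the counts of a two-class confusion matrix (AUC excepted, since it needs the ranked scores). The sketch below shows those relationships; the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute common diagnostic metrics from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: diseased patients correctly flagged
    specificity = tn / (tn + fp)   # healthy patients correctly cleared
    ppv = tp / (tp + fp)           # precision: flagged patients who are diseased
    npv = tn / (tn + fn)           # cleared patients who are healthy
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```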
RESULTS:
In the best performing four-class model (neural network model predicting no, mild, moderate or severe OSA), a prediction of moderate and/or severe disease had a combined PPV of 94%; one out of 335 patients had no OSA and 19 had mild OSA. In the best performing two-class model (logistic regression model predicting no-mild vs. moderate-severe OSA), the PPV for moderate-severe OSA was 92%; two out of 350 patients had no OSA and 26 had mild OSA.
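The reported PPVs can be checked against the counts given above, assuming that 335 and 350 are the numbers of patients predicted to have moderate or severe OSA and that the patients with no or mild OSA among them are the false positives. This reading is an assumption; the helper name is illustrative.

```python
def ppv_from_counts(predicted_positive, false_positive):
    """PPV = true positives / all predicted positives."""
    return (predicted_positive - false_positive) / predicted_positive


# Four-class model: 1 patient with no OSA and 19 with mild OSA among 335 predictions.
four_class_ppv = ppv_from_counts(335, 1 + 19)
# Two-class model: 2 patients with no OSA and 26 with mild OSA among 350 predictions.
two_class_ppv = ppv_from_counts(350, 2 + 26)

print(f"four-class PPV: {four_class_ppv:.1%}")
print(f"two-class PPV: {two_class_ppv:.1%}")
```

Under this reading, both values round to the percentages quoted in the abstract.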
CONCLUSION:
Our study showed that the prediction of moderate-severe OSA in a tertiary setting with an ML approach is a viable option to facilitate early identification of OSA. Prospective studies with home-based oximeters and analysis of other oximetry variables are the next steps towards formal implementation.
Humans ; Oximetry/methods* ; Sleep Apnea, Obstructive/diagnosis* ; Male ; Female ; Middle Aged ; Machine Learning ; Polysomnography ; Adult ; Anthropometry ; ROC Curve ; Aged ; Algorithms ; Predictive Value of Tests ; Sensitivity and Specificity ; Neural Networks, Computer ; Demography
2. Use of deep learning model for paediatric elbow radiograph binomial classification: initial experience, performance and lessons learnt.
Mark Bangwei TAN ; Yuezhi Russ CHUA ; Qiao FAN ; Marielle Valerie FORTIER ; Peiqi Pearlly CHANG
Singapore medical journal 2025;66(4):208-214
INTRODUCTION:
In this study, we aimed to compare the performance of a convolutional neural network (CNN)-based deep learning model that was trained on a dataset of normal and abnormal paediatric elbow radiographs with that of paediatric emergency department (ED) physicians on a binomial classification task.
METHODS:
A total of 1,314 paediatric elbow lateral radiographs (patient mean age 8.2 years) were retrospectively retrieved and classified based on annotation as normal or abnormal (with pathology). They were then randomly partitioned into a development set (993 images); first and second tuning (validation) sets (109 and 100 images, respectively); and a test set (112 images). An artificial intelligence (AI) model was trained on the development set using the EfficientNet B1 network architecture. Its performance on the test set was compared to that of five physicians (inter-rater agreement: fair). The performance of the AI model and that of the physician group were compared using the McNemar test.
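The McNemar test compares two raters on the same cases using only the discordant pairs: cases one rater got right and the other got wrong. The sketch below implements the exact (binomial) form of the test. The abstract does not state which variant was used or how the five physicians were pooled, so the function and the counts are illustrative assumptions.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases rater A classified correctly and rater B incorrectly.
    c: cases rater B classified correctly and rater A incorrectly.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, not taken from the study.
print(f"p = {mcnemar_exact(2, 10):.4f}")
```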
RESULTS:
The accuracy of the AI model on the test set was 80.4% (95% confidence interval [CI] 71.8%-87.3%), and the area under the receiver operating characteristic curve (AUROC) was 0.872 (95% CI 0.831-0.947). The performance of the AI model vs. the physician group on the test set was: sensitivity 79.0% (95% CI: 68.4%-89.5%) vs. 64.9% (95% CI: 52.5%-77.3%; P = 0.088); and specificity 81.8% (95% CI: 71.6%-92.0%) vs. 87.3% (95% CI: 78.5%-96.1%; P = 0.439).
CONCLUSION:
The AI model showed good AUROC values and higher sensitivity, with the P-value at nominal significance when compared to the clinician group.
Humans ; Deep Learning ; Child ; Retrospective Studies ; Male ; Female ; Radiography/methods* ; ROC Curve ; Elbow/diagnostic imaging* ; Neural Networks, Computer ; Child, Preschool ; Elbow Joint/diagnostic imaging* ; Emergency Service, Hospital ; Adolescent ; Infant ; Artificial Intelligence
3. Development of an abdominal acupoint localization system based on AI deep learning.
Mo ZHANG ; Yuming LI ; Zongming SHI
Chinese Acupuncture & Moxibustion 2025;45(3):391-396
This study aims to develop an abdominal acupoint localization system based on computer vision and convolutional neural networks (CNNs). To address the challenge of abdominal acupoint localization, a multi-task CNN architecture was constructed and trained to locate Shenque (CV8) and the boundaries of the human body. Based on the identified Shenque (CV8), the system further deduces key characteristics of four acupoints: Shangwan (CV13), Qugu (CV2), and bilateral Daheng (SP15). An affine transformation matrix is then applied to map image coordinates to an acupoint template space, achieving precise localization of abdominal acupoints. Testing has verified that the system can accurately identify and locate abdominal acupoints in images. The localization system provides technical support for TCM remote education, diagnostic assistance, and advanced TCM equipment such as intelligent acupuncture robots, facilitating the standardization and intelligent advancement of acupuncture.
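An affine transformation between two 2D coordinate systems is fully determined by three non-collinear point correspondences. The sketch below solves for the 2x3 matrix with Cramer's rule and applies it to a point. The landmark coordinates are hypothetical, and the abstract does not say how the authors estimate their matrix, so this illustrates the general technique only.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_affine(src, dst):
    """Return [[a, b, tx], [c, d, ty]] mapping three src points onto dst."""
    base = [[x, y, 1.0] for x, y in src]
    d = det3(base)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for axis in (0, 1):  # solve separately for the x row and the y row
        rhs = [p[axis] for p in dst]
        coeffs = []
        for col in range(3):
            m = [row[:] for row in base]
            for i in range(3):
                m[i][col] = rhs[i]  # Cramer's rule: swap in the right-hand side
            coeffs.append(det3(m) / d)
        rows.append(coeffs)
    return rows


def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# Hypothetical landmarks: image pixels (src) and template coordinates (dst).
image_landmarks = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
template_landmarks = [(10.0, 20.0), (12.0, 20.0), (10.0, 23.0)]
matrix = solve_affine(image_landmarks, template_landmarks)
print(apply_affine(matrix, (1.0, 1.0)))
```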
Acupuncture Points ; Humans ; Deep Learning ; Abdomen/diagnostic imaging* ; Neural Networks, Computer ; Acupuncture Therapy ; Image Processing, Computer-Assisted
4. Construction of an artificial intelligence-assisted system for auxiliary detection of auricular point features based on the YOLO neural network.
Ganhong WANG ; Zihao ZHANG ; Kaijian XIA ; Yanting ZHOU ; Meijuan XI ; Jian CHEN
Chinese Acupuncture & Moxibustion 2025;45(4):413-420
OBJECTIVE:
To develop an artificial intelligence-assisted system for the automatic detection of the features of 21 common auricular points based on the YOLOv8 neural network.
METHODS:
A total of 660 human auricular images were collected from three research centers from June 2019 to February 2024. The rectangular boxes and features in the images were annotated using the LabelMe 5.3.1 tool and converted into a format compatible with the YOLO model. Using these data, transfer learning and fine-tuning were conducted on pretrained YOLO neural network models of different scales. The model's performance was evaluated on validation and test sets using the mean average precision (mAP) at various thresholds, recall rate (recall), frames per second (FPS) and confusion matrices. Finally, the model was deployed on a local computer, and real-time detection of human auricular images was conducted using a camera.
RESULTS:
Five different versions of the YOLOv8 key-point detection model were developed: YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x. On the validation set, YOLOv8n showed the best performance in terms of speed (225.736 frames per second) and precision (0.998). On the external test set, YOLOv8n achieved an accuracy of 0.991, a sensitivity of 1.0, and an F1 score of 0.995. For the localization of auricular point features, the model showed an average accuracy of 0.990, a precision of 0.995, and a recall of 0.997 at a 50% intersection-over-union threshold (mAP50).
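For bounding boxes, mAP50 counts a detection as correct when its intersection over union (IoU) with the ground-truth box is at least 0.5. The sketch below computes IoU for two axis-aligned boxes. The box format and the coordinates are assumptions for illustration and are not taken from the paper's pipeline.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
predicted = (1.0, 0.0, 5.0, 4.0)
ground_truth = (0.0, 0.0, 4.0, 4.0)
iou = box_iou(predicted, ground_truth)
print(f"IoU = {iou:.2f}, counted as a match at the 0.5 threshold: {iou >= 0.5}")
```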
CONCLUSION:
The key-point detection model for 21 common auricular points based on YOLOv8n exhibits excellent predictive performance and is capable of rapidly and automatically locating and classifying auricular points.
Humans ; Neural Networks, Computer ; Artificial Intelligence ; Acupuncture Points
5. Research status of automatic localization of acupoint based on deep learning.
Yuge DONG ; Chengbin WANG ; Weigang MA ; Weifang GAO ; Yuzi TANG ; Yonglong ZHANG ; Jiwen QIU ; Haiyan REN ; Zhongzheng LI ; Tianyi ZHAO ; Zhongxi LV ; Xingfang PAN
Chinese Acupuncture & Moxibustion 2025;45(5):586-592
This paper reviews recently published articles on the application of deep learning methods to the automatic localization of acupoints and summarizes them in terms of 3 key links: dataset construction, neural network model design, and accuracy evaluation of acupoint localization. Significant progress has been made in deep learning for acupoint localization, but the scale of acupoint detection needs to be expanded, and the precision, generalization ability, and real-time performance of the models need to be improved. Future research should focus on standardized datasets and on the integration of 3D modeling and multimodal data fusion, so as to increase the accuracy and strengthen the personalization of acupoint localization.
Deep Learning ; Acupuncture Points ; Humans ; Neural Networks, Computer
6. An efficient and lightweight skin pathology detection method based on multi-scale feature fusion using an improved RT-DETR model.
Yuying REN ; Lingxiao HUANG ; Fang DU ; Xinbo YAO
Journal of Southern Medical University 2025;45(2):409-421
OBJECTIVES:
Multi-scale skin lesion regions, image noise interference, and the limited resources of auxiliary diagnostic equipment all affect the accuracy of skin disease detection. To solve these problems, we propose a highly efficient and lightweight skin disease detection model using an improved RT-DETR model.
METHODS:
A lightweight FasterNet was introduced as the backbone network and the FasterNetBlock module was parametrically refined. A Convolutional and Attention Fusion Module (CAFM) was used to replace the multi-head self-attention mechanism in the neck network to enhance the ability of the AIFI-CAFM module for capturing global dependencies and local detail information. The DRB-HSFPN feature pyramid network was designed to replace the Cross-Scale Feature Fusion Module (CCFM) to allow the integration of contextual information across different scales to improve the semantic feature expression capacity of the neck network. Finally, combining the advantages of Inner-IoU and EIoU, the Inner-EIoU was used to replace the original loss function GIoU to further enhance the model's inference accuracy and convergence speed.
RESULTS:
The experimental results on the HAM10000 dataset showed that the improved RT-DETR model, as compared with the original model, had increased mAP@50 and mAP@50:95 by 4.5% and 2.8%, respectively, with a detection speed of 59.1 frames per second (FPS). The improved model had a parameter count of 10.9 M and a computational load of 19.3 GFLOPs, which were reduced by 46.0% and 67.2% compared to those of the original model, validating the effectiveness of the improved model.
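The baseline figures are not given in the abstract, but they are implied by the reported reductions: if a quantity falls by a fraction r to a value v, the original was v / (1 - r). The sketch below applies that arithmetic. The results are approximations subject to rounding in the reported percentages, not values stated by the authors.

```python
def original_from_reduction(improved, reduction):
    """Recover the baseline value from the improved value and its fractional reduction."""
    return improved / (1.0 - reduction)


# Reported: 10.9 M parameters after a 46.0% reduction.
baseline_params_m = original_from_reduction(10.9, 0.460)
# Reported: 19.3 GFLOPs after a 67.2% reduction.
baseline_gflops = original_from_reduction(19.3, 0.672)

print(f"implied baseline parameters: about {baseline_params_m:.1f} M")
print(f"implied baseline compute: about {baseline_gflops:.1f} GFLOPs")
```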
CONCLUSIONS:
The proposed SD-DETR model significantly improves the performance of skin disease detection tasks by effectively extracting and integrating multi-scale features while reducing both parameter count and computational load.
Humans ; Skin Diseases/diagnosis* ; Skin/pathology* ; Neural Networks, Computer ; Algorithms
7. A multi-scale supervision and residual feedback optimization algorithm for improving optic chiasm and optic nerve segmentation accuracy in nasopharyngeal carcinoma CT images.
Jinyu LIU ; Shujun LIANG ; Yu ZHANG
Journal of Southern Medical University 2025;45(3):632-642
OBJECTIVES:
We propose a novel deep learning segmentation algorithm (DSRF) based on multi-scale supervision and residual feedback strategy for precise segmentation of the optic chiasm and optic nerves in CT images of nasopharyngeal carcinoma (NPC) patients.
METHODS:
We collected 212 NPC CT images and their ground truth labels from SegRap2023, StructSeg2019 and HaN-Seg2023 datasets. Based on a hybrid pooling strategy, we designed a decoder (HPS) to reduce small organ feature loss during pooling in convolutional neural networks. This decoder uses adaptive and average pooling to refine high-level semantic features, which are integrated with primary semantic features to enable network learning of finer feature details. We employed multi-scale deep supervision layers to learn rich multi-scale and multi-level semantic features under deep supervision, thereby enhancing boundary identification of the optic chiasm and optic nerves. A residual feedback module that enables multiple iterations of the network was designed for contrast enhancement of the optic chiasm and optic nerves in CT images by utilizing information from fuzzy boundaries and easily confused regions to iteratively refine segmentation results under supervision. The entire segmentation framework was optimized with the loss from each iteration to enhance segmentation accuracy and boundary clarity. Ablation experiments and comparative experiments were conducted to evaluate the effectiveness of each component and the performance of the proposed model.
RESULTS:
The DSRF algorithm could effectively enhance feature representation of small organs to achieve accurate segmentation of the optic chiasm and optic nerves with an average DSC of 0.837 and an ASSD of 0.351. Ablation experiments further verified the contributions of each component in the DSRF method.
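The Dice similarity coefficient (DSC) reported above measures the overlap between a predicted and a reference mask: twice the intersection divided by the sum of the two mask sizes. The sketch below computes it for flattened binary masks. The masks are toy data, and the surface-distance metric (ASSD) is not covered.

```python
def dice_coefficient(pred, truth):
    """DSC for two binary masks given as equal-length flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 2 * intersection / total if total else 1.0


# Toy masks for illustration only.
predicted_mask = [1, 1, 0, 0, 1]
reference_mask = [1, 0, 0, 1, 1]
print(f"DSC = {dice_coefficient(predicted_mask, reference_mask):.3f}")
```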
CONCLUSIONS:
The proposed deep learning segmentation algorithm can effectively enhance feature representation to achieve accurate segmentation of the optic chiasm and optic nerves in CT images of NPC.
Humans ; Tomography, X-Ray Computed/methods* ; Optic Chiasm/diagnostic imaging* ; Optic Nerve/diagnostic imaging* ; Algorithms ; Nasopharyngeal Carcinoma ; Deep Learning ; Nasopharyngeal Neoplasms/diagnostic imaging* ; Neural Networks, Computer ; Image Processing, Computer-Assisted/methods*
8. A lightweight classification network for single-lead atrial fibrillation based on depthwise separable convolution and attention mechanism.
Yong HONG ; Xin ZHANG ; Mingjun LIN ; Qiucen WU ; Chaomin CHEN
Journal of Southern Medical University 2025;45(3):650-660
OBJECTIVES:
To design a deep learning model that balances model complexity and performance to enable its integration into wearable ECG monitoring devices for automated diagnosis of atrial fibrillation.
METHODS:
This study was performed based on data from 84 patients with atrial fibrillation, 25 patients with atrial fibrillation, and 18 subjects without obvious arrhythmia collected from the publicly available datasets LTAFDB, AFDB, and NSRDB, respectively. A lightweight attention network based on depthwise separable convolution and fusion of channel-spatial information, namely DSC-AttNet, was proposed. Depthwise separable convolution was introduced to replace standard convolution and reduce model parameters and computational complexity to realize high efficiency and light weight of the model. The multilayer hybrid attention mechanism was embedded to compute the attentional weights of the channels and spatial information at different scales to improve the feature expression ability of the model. Ten-fold cross-validation was performed on LTAFDB, and external independent testing was conducted on AFDB and NSRDB datasets.
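The parameter saving from depthwise separable convolution can be shown by counting weights. A standard 1D convolution needs c_in * c_out * k weights, while the separable form needs c_in * k (depthwise) plus c_in * c_out (pointwise). The layer sizes below are hypothetical and biases are ignored; they are not DSC-AttNet's actual configuration.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a standard 1D convolution (biases ignored)."""
    return c_in * c_out * kernel_size


def separable_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a depthwise separable 1D convolution (biases ignored)."""
    depthwise = c_in * kernel_size  # one k-tap filter per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, kernel size 7.
standard = standard_conv1d_params(64, 128, 7)
separable = separable_conv1d_params(64, 128, 7)
print(f"standard: {standard}, separable: {separable}, ratio: {separable / standard:.3f}")
```

The ratio equals 1/c_out + 1/k, so the saving grows with both the number of output channels and the kernel size.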
RESULTS:
DSC-AttNet achieved a ten-fold average accuracy of 97.33% and a precision of 97.30% on the test set, both of which outperformed the other 4 comparison models as well as the 3 classical models. The accuracy of the model on the external test set reached 92.78%, better than those of the 3 classical models. The number of parameters of DSC-AttNet was 1.01M, and the computational volume was 27.19G, both smaller than those of the 3 classical models.
CONCLUSIONS:
The proposed method has lower complexity, achieves better classification performance, and generalizes better for atrial fibrillation classification.
Atrial Fibrillation/diagnosis* ; Humans ; Electrocardiography ; Deep Learning ; Wearable Electronic Devices ; Neural Networks, Computer
9. Construction of recognition models for subthreshold depression based on multiple machine learning algorithms and vocal emotional characteristics.
Meimei CHEN ; Yang WANG ; Huangwei LEI ; Fei ZHANG ; Ruina HUANG ; Zhaoyang YANG
Journal of Southern Medical University 2025;45(4):711-717
OBJECTIVES:
To construct vocal recognition classification models using 6 machine learning algorithms and vocal emotional characteristics of individuals with subthreshold depression to facilitate early identification of subthreshold depression.
METHODS:
We collected voice data from both normal individuals and participants with subthreshold depression by asking them to read specifically chosen words and texts. From each voice sample, 384-dimensional vocal emotional feature variables were extracted, including energy, Mel-frequency cepstral coefficients, zero-crossing rate, voicing probability, fundamental frequency, and difference features. The Recursive Feature Elimination (RFE) method was employed to select voice feature variables. Classification models were then built using the machine learning algorithms Adaptive Boosting (AdaBoost), Random Forest (RF), Linear Discriminant Analysis (LDA), Logistic Regression (LR), Lasso Regression (LRLasso), and Support Vector Machine (SVM), and the performance of these models was evaluated. To assess the generalization capability of the models, we used real-world speech data to evaluate the best speech recognition classification model.
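Of the features listed above, the zero-crossing rate is the simplest to illustrate: it is the fraction of adjacent sample pairs in which the signal changes sign. The sketch below is a generic definition applied to toy samples. The authors' extraction toolkit and exact feature definitions are not stated in the abstract, so this is an assumption about the general technique only.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


# Toy waveform fragment, for illustration only.
frame = [0.5, -0.2, -0.4, 0.3, 0.1]
print(f"zero-crossing rate = {zero_crossing_rate(frame):.2f}")
```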
RESULTS:
The AdaBoost, RF, and LDA models achieved high prediction accuracies of 100%, 100%, and 93.3%, respectively, on the word-reading speech test set. On the text-reading speech test set, the accuracies of the AdaBoost, RF, and LDA models were 90%, 80%, and 90%, respectively, while the accuracies of the other 3 models were all below 80%. On real-world word-reading and text-reading speech data, the classification models using AdaBoost and Random Forest still achieved high predictive accuracies (91.7% and 80.6% for AdaBoost and 86.1% and 77.8% for Random Forest, respectively).
CONCLUSIONS:
Analyzing vocal emotional characteristics allows effective identification of individuals with subthreshold depression. The AdaBoost and RF models show excellent performance in classifying individuals with subthreshold depression and may thus offer valuable assistance in clinical and research settings.
Humans ; Machine Learning ; Emotions ; Depression/diagnosis* ; Algorithms ; Voice ; Support Vector Machine ; Male ; Female
10. AConvLSTM U-Net: a multi-scale jaw cyst segmentation model based on bidirectional dense connection and attention mechanism.
Suqiang LI ; Zhouyang WANG ; Sixian CHAN ; Xiaolong ZHOU
Journal of Southern Medical University 2025;45(5):1082-1092
OBJECTIVES:
We propose a multi-scale jaw cyst segmentation model, AConvLSTM U-Net, which is based on bidirectional dense connections and attention mechanisms to achieve accurate automatic segmentation of mandibular cyst images.
METHODS:
A dataset consisting of 2592 jaw cyst images was used. In AConvLSTM U-Net, an MBC was designed on the encoding path to enhance feature extraction capabilities. A DPD was used to connect the encoder and decoder, and a bidirectional ConvLSTM was introduced in the skip connections to obtain rich semantic information. A decoding block based on scSE was then used on the decoding path to enhance the focus on important information. Finally, a DS was designed, and the model was optimized by integrating a joint loss function to further improve the segmentation accuracy.
RESULTS:
The experiment with AConvLSTM U-Net for jaw cyst lesion segmentation showed an MCC of 93.8443%, a DSC of 93.9067%, and a JSC of 88.5133%, outperforming all the other comparison segmentation models.
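For a single pair of masks, the Dice coefficient (DSC) and the Jaccard coefficient (JSC) are linked by JSC = DSC / (2 - DSC). The sketch below applies that identity to the reported DSC. Metrics averaged over many images need not satisfy it exactly, so the agreement with the reported JSC is a consistency check rather than something the abstract states.

```python
def jaccard_from_dice(dsc):
    """Convert a Dice coefficient to the equivalent Jaccard coefficient."""
    return dsc / (2.0 - dsc)


def dice_from_jaccard(jsc):
    """Convert a Jaccard coefficient to the equivalent Dice coefficient."""
    return 2.0 * jsc / (1.0 + jsc)


reported_dsc = 0.939067  # 93.9067% from the abstract
reported_jsc = 0.885133  # 88.5133% from the abstract

implied_jsc = jaccard_from_dice(reported_dsc)
print(f"implied JSC = {implied_jsc:.4%}, reported JSC = {reported_jsc:.4%}")
```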
CONCLUSIONS:
The proposed algorithm shows high accuracy and robustness on the jaw cyst dataset, demonstrating its superior performance over many existing methods for automatic segmentation of jaw cyst images and its potential to assist clinical diagnosis.
Humans ; Jaw Cysts/diagnostic imaging* ; Algorithms ; Image Processing, Computer-Assisted/methods* ; Neural Networks, Computer
