1. Diagnostic performance of a computer-aided system for tuberculosis screening in two Philippine cities
Gabrielle P. Flores ; Reiner Lorenzo J. Tamayo ; Robert Neil F. Leong ; Christian Sergio M. Biglaen ; Kathleen Nicole T. Uy ; Renee Rose O. Maglente ; Marlex Jorome M. Nuguid ; Jason V. Alacap
Acta Medica Philippina 2025;59(2):33-40
BACKGROUND AND OBJECTIVES
The Philippines faces challenges in tuberculosis (TB) screening, one of which is the shortage of health workers trained and authorized to screen for TB. Deep learning neural networks (DLNNs) have shown potential in the TB screening process utilizing chest radiographs (CXRs). However, local studies on AI-based TB screening are limited. This study evaluated the diagnostic performance of the qXR 3.0 technology for TB screening in Filipino adults aged 15 and older. Specifically, we evaluated the sensitivity and specificity of qXR 3.0 against radiologists' impressions and determined whether it meets the World Health Organization (WHO) standards.
METHODS
A prospective cohort design was used to compare the screening and diagnostic accuracies of qXR 3.0 and two radiologists' gradings in accordance with the Standards for Reporting Diagnostic Accuracy (STARD). Patients seeking consultation at two Metro Manila clinics equipped with qXR 3.0 at the time of the study were invited to participate and had CXRs and sputum specimens collected. The radiologists' and qXR 3.0's readings and impressions were compared against the reference standard, the Xpert MTB/RIF assay, and diagnostic accuracy measures were calculated.
RESULTS
With 82 participants, qXR 3.0 demonstrated 100% sensitivity and 72.7% specificity with respect to the reference standard. There was strong agreement between qXR 3.0 and the radiologists' readings, as exhibited by concordance indices of 0.7895 (qXR 3.0 vs. CXRs read by at least one radiologist), 0.9362 (qXR 3.0 vs. CXRs read by both radiologists), and 0.9403 (qXR 3.0 vs. CXRs read as not suggestive of TB by at least one radiologist).
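As a worked illustration of the measures reported above, the sketch below computes sensitivity and specificity from a 2×2 table against the Xpert MTB/RIF reference. The counts are hypothetical, chosen only to be consistent with the reported n=82 and accuracy figures; the abstract does not publish the underlying cross-tabulation.

```python
# Hedged illustration: sensitivity/specificity of an index test versus the
# Xpert MTB/RIF reference. Counts are HYPOTHETICAL -- picked to sum to n=82
# and roughly reproduce the reported 100% sensitivity / 72.7% specificity.

def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int):
    """Sensitivity and specificity from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)  # proportion of reference-positives detected
    specificity = tn / (tn + fp)  # proportion of reference-negatives cleared
    return sensitivity, specificity

sens, spec = diagnostic_accuracy(tp=9, fp=20, fn=0, tn=53)  # 9+20+0+53 = 82
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")    # 100.0%, 72.6%
```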
CONCLUSIONS
qXR 3.0 demonstrated high sensitivity in identifying the presence of TB among patients and meets the WHO standard of at least 70% specificity for detecting true TB infection. This shows immense potential for the tool to help address the shortage of radiologists for TB screening in the country. Future research may consider larger sample sizes to confirm these findings and explore the economic value of the mainstream adoption of qXR 3.0 for TB screening.
Human ; Tuberculosis ; Diagnostic Imaging ; Deep Learning
2. SPECT-MPI for Coronary Artery Disease: A deep learning approach
Vincent Peter C. Magboo ; Ma. Sheila A. Magboo
Acta Medica Philippina 2024;58(8):67-75
Background:
Worldwide, coronary artery disease (CAD) is a leading cause of mortality and morbidity and remains a top health priority in many countries. A non-invasive imaging modality for the diagnosis of CAD such as single photon emission computed tomography-myocardial perfusion imaging (SPECT-MPI) is usually requested by cardiologists, as it displays the radiotracer distribution in the heart, reflecting myocardial perfusion. SPECT-MPI is interpreted visually by a nuclear medicine physician; the reading is largely dependent on the physician's clinical experience and shows significant inter-observer variability.
Objective:
The aim of the study is to apply a deep learning approach in the classification of SPECT-MPI for perfusion abnormalities using convolutional neural networks (CNN).
Methods:
A publicly available anonymized SPECT-MPI dataset from a machine learning repository (https://www.kaggle.com/selcankaplan/spect-mpi) was used in this study, involving 192 patients who underwent stress-test-rest Tc99m MPI. An exploratory approach to CNN hyperparameter selection was utilized to search for the optimum neural network model, with particular focus on various dropouts (0.2, 0.5, 0.7), batch sizes (8, 16, 32, 64), and numbers of dense nodes (32, 64, 128, 256). The base CNN model was also compared with pre-trained CNNs commonly used on medical images, such as VGG16, InceptionV3, DenseNet121, and ResNet50. All simulation experiments were performed in Kaggle using TensorFlow 2.6.0, Keras 2.6.0, and Python 3.7.10.
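The hyperparameter sweep described above can be pictured as a simple grid search. The sketch below uses the dropout rates, batch sizes, and dense-node counts listed in the abstract, but the layer stack of the base CNN is an assumption, since the abstract does not specify the architecture in detail.

```python
# Sketch of the exploratory hyperparameter search: 3 dropouts x 4 batch sizes
# x 4 dense widths = 48 candidate runs. Layer sizes are illustrative only.
import itertools
from tensorflow.keras import layers, models

def build_base_cnn(dropout: float, dense_nodes: int,
                   input_shape=(128, 128, 1)):
    """Minimal convolutional classifier; the exact stack is an assumption."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(dropout),
        layers.Dense(dense_nodes, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # normal vs. abnormal perfusion
    ])

for dropout, batch_size, dense_nodes in itertools.product(
        [0.2, 0.5, 0.7], [8, 16, 32, 64], [32, 64, 128, 256]):
    model = build_base_cnn(dropout, dense_nodes)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, batch_size=batch_size, epochs=...,
    #           validation_data=(x_val, y_val))  # training data not shown
```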
Results:
The best-performing base CNN model, with 0.7 dropout, batch size 8, and 32 dense nodes, generated the highest normalized Matthews Correlation Coefficient at 0.909 and obtained 93.75% accuracy, 96.00% sensitivity, 96.00% precision, and 96.00% F1-score. It also obtained higher classification performance compared with the pre-trained architectures.
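A note on the "normalized" Matthews Correlation Coefficient reported above: a common convention rescales MCC from [-1, 1] to [0, 1] as (MCC + 1)/2. Whether the authors used this exact rescaling is an assumption; the toy sketch below shows the computation with scikit-learn on invented labels.

```python
# Toy computation of MCC and one common normalization to [0, 1]. The labels
# below are invented for demonstration and are not the study's data.
from sklearn.metrics import matthews_corrcoef

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

mcc = matthews_corrcoef(y_true, y_pred)
normalized_mcc = (mcc + 1) / 2  # assumed rescaling convention
print(f"MCC={mcc:.3f}, normalized={normalized_mcc:.3f}")
```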
Conclusions:
The results suggest that deep learning approaches using CNN models can be deployed by nuclear medicine physicians in their clinical practice to further augment their decision-making in the interpretation of SPECT-MPI tests. These CNN models can also serve as a dependable and valid second opinion and decision-support tool, as well as teaching or learning material for less-experienced physicians, particularly those still in training. This highlights the clinical utility of deep learning approaches through CNN models in the practice of nuclear cardiology.
Coronary Artery Disease ; Deep Learning
3. Deep learning-based radiomics allows for a more accurate assessment of sarcopenia as a prognostic factor in hepatocellular carcinoma.
Zhikun LIU ; Yichao WU ; Abid Ali KHAN ; Lu LUN ; Jianguo WANG ; Jun CHEN ; Ningyang JIA ; Shusen ZHENG ; Xiao XU
Journal of Zhejiang University. Science. B 2024;25(1):83-90
Hepatocellular carcinoma (HCC) is one of the most common malignancies and is a major cause of cancer-related mortalities worldwide (Forner et al., 2018; He et al., 2023). Sarcopenia is a syndrome characterized by an accelerated loss of skeletal muscle (SM) mass that may be age-related or the result of malnutrition in cancer patients (Cruz-Jentoft and Sayer, 2019). Preoperative sarcopenia in HCC patients treated with hepatectomy or liver transplantation is an independent risk factor for poor survival (Voron et al., 2015; van Vugt et al., 2016). Previous studies have used various criteria to define sarcopenia, including muscle area and density. However, the lack of standardized diagnostic methods for sarcopenia limits their clinical use. In 2018, the European Working Group on Sarcopenia in Older People (EWGSOP) renewed a consensus on the definition of sarcopenia: low muscle strength, loss of muscle quantity, and poor physical performance (Cruz-Jentoft et al., 2019). Radiological imaging-based measurement of muscle quantity or mass is most commonly used to evaluate the degree of sarcopenia. The gold standard is to measure the SM and/or psoas muscle (PM) area using abdominal computed tomography (CT) at the third lumbar vertebra (L3), as it is linearly correlated to whole-body SM mass (van Vugt et al., 2016). According to a "North American Expert Opinion Statement on Sarcopenia," SM index (SMI) is the preferred measure of sarcopenia (Carey et al., 2019). The variability between morphometric muscle indexes revealed that they have different clinical relevance and are generally not applicable to broader populations (Esser et al., 2019).
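For context, the skeletal muscle index (SMI) referenced above is conventionally computed as the L3-level cross-sectional skeletal muscle area divided by height squared. A minimal sketch follows, with illustrative values not drawn from the cited studies.

```python
# Conventional SMI calculation: L3-level skeletal muscle area (cm^2) from an
# abdominal CT segmentation, normalized by height squared (m^2). Example
# values are illustrative placeholders only.

def skeletal_muscle_index(sm_area_cm2: float, height_m: float) -> float:
    """SMI in cm^2/m^2 from an L3-level CT muscle measurement."""
    return sm_area_cm2 / height_m ** 2

print(skeletal_muscle_index(sm_area_cm2=140.0, height_m=1.70))  # ~48.4 cm^2/m^2
```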
Humans ; Aged ; Sarcopenia/diagnostic imaging* ; Carcinoma, Hepatocellular/diagnostic imaging* ; Muscle, Skeletal/diagnostic imaging* ; Deep Learning ; Prognosis ; Radiomics ; Liver Neoplasms/diagnostic imaging* ; Retrospective Studies
4. Diagnostic performance of a computer-aided system for tuberculosis screening in two Philippine cities
Gabrielle P. Flores ; Reiner Lorenzo J. Tamayo ; Robert Neil F. Leong ; Christian Sergio M. Biglaen ; Kathleen Nicole T. Uy ; Renee Rose O. Maglente ; Marlex Jorome M. Nuguid ; Jason V. Alacap
Acta Medica Philippina 2024;58(Early Access 2024):1-8
Background and Objectives:
The Philippines faces challenges in tuberculosis (TB) screening, one of which is the shortage of health workers trained and authorized to screen for TB. Deep learning neural networks (DLNNs) have shown potential in the TB screening process utilizing chest radiographs (CXRs). However, local studies on AI-based TB screening are limited. This study evaluated the diagnostic performance of the qXR 3.0 technology for TB screening in Filipino adults aged 15 and older. Specifically, we evaluated the sensitivity and specificity of qXR 3.0 against radiologists' impressions and determined whether it meets the World Health Organization (WHO) standards.
Methods:
A prospective cohort design was used to compare the screening and diagnostic accuracies of qXR 3.0 and two radiologists' gradings in accordance with the Standards for Reporting Diagnostic Accuracy (STARD). Patients seeking consultation at two Metro Manila clinics equipped with qXR 3.0 at the time of the study were invited to participate and had CXRs and sputum specimens collected. The radiologists' and qXR 3.0's readings and impressions were compared against the reference standard, the Xpert MTB/RIF assay, and diagnostic accuracy measures were calculated.
Results:
With 82 participants, qXR 3.0 demonstrated 100% sensitivity and 72.7% specificity with respect to the reference standard. There was strong agreement between qXR 3.0 and the radiologists' readings, as exhibited by concordance indices of 0.7895 (qXR 3.0 vs. CXRs read by at least one radiologist), 0.9362 (qXR 3.0 vs. CXRs read by both radiologists), and 0.9403 (qXR 3.0 vs. CXRs read as not suggestive of TB by at least one radiologist).
Conclusions:
qXR 3.0 demonstrated high sensitivity in identifying the presence of TB among patients and meets the WHO standard of at least 70% specificity for detecting true TB infection. This shows immense potential for the tool to help address the shortage of radiologists for TB screening in the country. Future research may consider larger sample sizes to confirm these findings and explore the economic value of the mainstream adoption of qXR 3.0 for TB screening.
Tuberculosis ; Diagnostic Imaging ; Deep Learning
5. The impact of anatomic racial variations on artificial intelligence analysis of Filipino retinal fundus photographs using an image-based deep learning model
Carlo A. Kasala ; Kaye Lani Rea B. Locaylocay ; Paolo S. Silva
Philippine Journal of Ophthalmology 2024;49(2):130-137
OBJECTIVES
This study evaluated the accuracy of an artificial intelligence (AI) model in identifying retinal lesions, validated its performance on a Filipino population dataset, and evaluated the impact of dataset diversity on AI analysis accuracy.
METHODS
This cross-sectional, analytical, institutional study analyzed standardized macula-centered fundus photos taken with the Zeiss Visucam®. The AI model’s output was compared with manual readings by trained retina specialists.
RESULTS
A total of 215 eyes from 109 patients were included in the study. Human graders identified 109 eyes (50.7%) with retinal abnormalities. The AI model demonstrated an overall accuracy of 73.0% (95% CI 66.6% – 78.8%) in detecting abnormal retinas, with a sensitivity of 54.1% (95% CI 44.3% – 63.7%) and specificity of 92.5% (95% CI 85.7% – 96.7%).
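The intervals above can be reproduced with a standard binomial confidence interval routine. The abstract does not state which interval method was used, so the Wilson method is assumed below, and the event count is back-calculated (54.1% of 109 abnormal eyes ≈ 59 detected) rather than taken from the source data.

```python
# Assumed reconstruction of the reported sensitivity CI using a Wilson
# interval; the count of 59 is back-calculated, not published data.
from statsmodels.stats.proportion import proportion_confint

detected, abnormal = 59, 109
ci_low, ci_high = proportion_confint(count=detected, nobs=abnormal,
                                     alpha=0.05, method="wilson")
print(f"sensitivity ~ {detected / abnormal:.1%}, "
      f"95% CI {ci_low:.1%} - {ci_high:.1%}")
```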
CONCLUSION
The availability and sources of AI training datasets can introduce biases into AI algorithms. In our dataset, racial differences in retinal morphology, such as differences in retinal pigmentation, affected the accuracy of AI image-based analysis. More diverse datasets and external validation on different populations are needed to mitigate these biases.
Human ; Artificial Intelligence ; Deep Learning
6. Metal artifact reduction and clinical verification in oral and maxillofacial region based on deep learning.
Wei ZENG ; Shan Luo ZHOU ; Ji Xiang GUO ; Wei TANG
Chinese Journal of Stomatology 2023;58(6):540-546
Objective: To construct a neural network for eliminating metal artifacts in CT images by training a generative adversarial network (GAN) model, so as to provide a reference for clinical practice. Methods: The CT data of patients treated in the Department of Radiology, West China Hospital of Stomatology, Sichuan University from January 2017 to June 2022 were collected. A total of 1 000 artifact-free CT datasets and 620 metal artifact CT datasets were obtained, covering 5 types of metal restorative materials: fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies. Four hundred metal artifact CT datasets and 1 000 artifact-free CT datasets were utilized for simulation synthesis, constructing 1 000 pairs of simulated metal artifact images and corresponding artifact-free images (200 pairs for each type). Under the condition that the data for the five metal artifact types were equal, the entire dataset was randomly (computer random) divided into a training set (800 pairs) and a test set (200 pairs). The former was used to train the GAN model, and the latter was used to evaluate its performance. The test set was evaluated quantitatively with the root-mean-square error (RMSE) and the structural similarity index measure (SSIM). The trained GAN model was then employed to eliminate the metal artifacts from the CT data of the remaining 220 clinical cases with metal artifacts, and the elimination results were evaluated by two senior attending doctors using a modified Likert scale. Results: The RMSE values for artifact elimination of fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies in the test set were 0.018±0.004, 0.023±0.007, 0.015±0.003, 0.019±0.004, and 0.024±0.008, respectively (F=1.29, P=0.274). The SSIM values were 0.963±0.023, 0.961±0.023, 0.965±0.013, 0.958±0.022, and 0.957±0.026, respectively (F=2.22, P=0.069). The intraclass correlation coefficient between the two evaluators was 0.972. For the 220 clinical cases, the overall modified Likert scale score was 3.73±1.13, indicating satisfactory performance. The modified Likert scale scores for fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies were 3.68±1.13, 3.67±1.16, 3.97±1.03, 3.83±1.14, and 3.33±1.12, respectively (F=1.44, P=0.145). Conclusions: The metal artifact reduction GAN model constructed in this study can effectively remove the interference of metal artifacts and improve image quality.
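The two quantitative metrics used above can be sketched as follows on a toy image pair; the GAN itself is omitted, and the random arrays merely stand in for a corrected CT slice and its artifact-free reference.

```python
# RMSE and SSIM on a placeholder image pair. Random arrays stand in for a
# GAN-corrected CT slice and its artifact-free reference.
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256))                      # artifact-free slice
corrected = np.clip(reference + rng.normal(0, 0.02, (256, 256)), 0, 1)

rmse = np.sqrt(np.mean((reference - corrected) ** 2))   # root-mean-square error
ssim = structural_similarity(reference, corrected, data_range=1.0)
print(f"RMSE={rmse:.4f}, SSIM={ssim:.4f}")
```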
Humans ; Tomography, X-Ray Computed/methods* ; Deep Learning ; Titanium ; Neural Networks, Computer ; Metals ; Image Processing, Computer-Assisted/methods* ; Algorithms
7. Automated diagnostic classification with lateral cephalograms based on deep learning network model.
Qiao CHANG ; Shao Feng WANG ; Fei Fei ZUO ; Fan WANG ; Bei Wen GONG ; Ya Jie WANG ; Xian Ju XIE
Chinese Journal of Stomatology 2023;58(6):547-553
Objective: To establish a comprehensive diagnostic classification model for lateral cephalograms based on artificial intelligence (AI) to provide a reference for orthodontic diagnosis. Methods: A total of 2 894 lateral cephalograms were collected in the Department of Orthodontics, Capital Medical University School of Stomatology from January 2015 to December 2021 to construct a dataset, including 1 351 males and 1 543 females with a mean age of (26.4±7.4) years. First, 2 orthodontists (with 5 and 8 years of orthodontic experience, respectively) performed manual annotation and measurement calculations for primary classification, and then 2 senior orthodontists (with more than 20 years of orthodontic experience) verified the 8 diagnostic classifications, covering skeletal and dental indices. The data were randomly divided into training, validation, and test sets in the ratio of 7∶2∶1. The open-source DenseNet121 was used to construct the model. The performance of the model was evaluated by classification accuracy, precision, sensitivity, specificity, and area under the curve (AUC). Model regions of interest were visualized through class activation heat maps. Results: The automatic classification model for lateral cephalograms was successfully established. It took 0.012 s on average to make 8 diagnoses on a lateral cephalogram. The accuracy of 5 classifications was 80%-90%, including sagittal and vertical skeletal facial pattern, mandibular growth, inclination of upper incisors, and protrusion of lower incisors. The accuracy of 3 classifications was 70%-80%, including maxillary growth, inclination of lower incisors, and protrusion of upper incisors. The average AUC of each classification was ≥0.90. The class activation heat maps of successfully classified lateral cephalograms showed that the AI model's activation regions were distributed in the relevant structural regions. Conclusions: In this study, an automatic classification model for lateral cephalograms was established based on DenseNet121 to achieve rapid classification of eight commonly used clinical diagnostic items.
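One plausible arrangement of the DenseNet121 model described above is a shared backbone with one classification head per diagnostic item. The abstract names only the backbone and the eight diagnostic classifications, so the head layout, input size, and per-item class counts below are assumptions.

```python
# Assumed layout: shared ImageNet-pretrained DenseNet121 backbone with one
# softmax head per diagnostic item. Head count (8) is from the abstract;
# input size and classes per item are illustrative guesses.
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet121

N_ITEMS = 8           # e.g., sagittal/vertical skeletal pattern, incisor items
CLASSES_PER_ITEM = 3  # assumed; in reality this varies by diagnostic item

backbone = DenseNet121(include_top=False, weights="imagenet",
                       input_shape=(224, 224, 3), pooling="avg")
heads = [layers.Dense(CLASSES_PER_ITEM, activation="softmax",
                      name=f"item_{i}")(backbone.output)
         for i in range(N_ITEMS)]
model = Model(inputs=backbone.input, outputs=heads)
model.compile(optimizer="adam",
              loss=["sparse_categorical_crossentropy"] * N_ITEMS)
model.summary()
```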
Male ; Female ; Humans ; Young Adult ; Adult ; Artificial Intelligence ; Deep Learning ; Cephalometry ; Maxilla ; Mandible/diagnostic imaging*
8. Research on multi-class orthodontic image recognition system based on deep learning network model.
Shao Feng WANG ; Xian Ju XIE ; Li ZHANG ; Qiao CHANG ; Fei Fei ZUO ; Ya Jie WANG ; Yu Xing BAI
Chinese Journal of Stomatology 2023;58(6):561-568
Objective: To develop a multi-classification orthodontic image recognition system using the SqueezeNet deep learning model for the automatic classification of orthodontic image data. Methods: A total of 35 000 clinical orthodontic images were collected in the Department of Orthodontics, Capital Medical University School of Stomatology, from October to November 2020 and June to July 2021. The images were from 490 orthodontic patients with a male-to-female ratio of 49∶51 and an age range of 4 to 45 years. After data cleaning based on inclusion and exclusion criteria, the final image dataset included 17 453 face images (frontal, smiling, 90° right, 90° left, 45° right, and 45° left), 8 026 intraoral images [frontal occlusion, right occlusion, left occlusion, upper occlusal view (original and flipped), lower occlusal view (original and flipped), and coverage of occlusal relationship], 4 115 X-ray images [lateral skull X-ray from the left side, lateral skull X-ray from the right side, frontal skull X-ray, cone-beam CT (CBCT), and wrist bone X-ray], and 684 other non-orthodontic images. A labeling team composed of orthodontic doctoral students, associate professors, and professors used image labeling tools to classify the orthodontic images into 20 categories: 6 face image categories, 8 intraoral image categories, 5 X-ray image categories, and other images. The data for each label were randomly divided into training, validation, and testing sets in an 8∶1∶1 ratio using the random function in the Python programming language. The improved SqueezeNet deep learning model was used for training, and 13 000 natural images from the open-source ImageNet dataset were used as additional non-orthodontic images for algorithm optimization of anomaly data processing. A multi-classification orthodontic image recognition system based on deep learning models was thus constructed. The accuracy of the orthodontic image classification was evaluated using precision, recall, F1-score, and the confusion matrix based on the prediction results of the test set. The reliability of the model's image classification logic was verified with heat maps generated by the gradient-weighted class activation mapping (Grad-CAM) method. Results: After data cleaning and labeling, a total of 30 278 orthodontic images were included in the dataset. The test set classification results showed that the precision, recall, and F1-scores of most classification labels were 100%, with only 5 misclassified images out of 3 047, resulting in a system accuracy of 99.84% (3 042/3 047). The precision of anomaly data processing was 100% (10 500/10 500). The heat maps showed that the judgment basis of the SqueezeNet deep learning model in the image classification process was essentially consistent with that of humans. Conclusions: This study developed a multi-classification orthodontic image recognition system for the automatic classification of 20 types of orthodontic images based on the improved SqueezeNet deep learning model. The system exhibited good accuracy in orthodontic image classification.
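The 8∶1∶1 split described above maps directly onto Python's random module, as the abstract indicates; the file names and seed in this sketch are illustrative assumptions.

```python
# Per-label 8:1:1 train/validation/test split using Python's random module.
# File names and the seed are placeholders, not the study's data.
import random

def split_8_1_1(items, seed=42):
    """Shuffle one label's images and split them at an 8:1:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train, n_val = int(len(items) * 0.8), int(len(items) * 0.1)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_8_1_1([f"img_{i:05d}.jpg" for i in range(1000)])
print(len(train), len(val), len(test))  # 800 100 100
```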
Humans ; Male ; Female ; Child, Preschool ; Child ; Adolescent ; Young Adult ; Adult ; Middle Aged ; Deep Learning ; Reproducibility of Results ; Radiography ; Algorithms ; Cone-Beam Computed Tomography
9. Research status and outlook of deep learning in oral and maxillofacial medical imaging.
Chinese Journal of Stomatology 2023;58(6):533-539
Artificial intelligence, represented by deep learning, has received increasing attention in the field of oral and maxillofacial medical imaging, where it has been widely studied for image analysis and image quality improvement. This narrative review provides insight into the following applications of deep learning in oral and maxillofacial imaging: detection, recognition, and segmentation of teeth and other anatomical structures; detection and diagnosis of oral and maxillofacial diseases; and forensic personal identification. In addition, the limitations of the studies and the directions for future development are summarized.
Artificial Intelligence ; Deep Learning ; Diagnostic Imaging ; Radiography ; Image Processing, Computer-Assisted
10. Application of Deep Learning in Differential Diagnosis of Ameloblastoma and Odontogenic Keratocyst Based on Panoramic Radiographs.
Min LI ; Chuang-Chuang MU ; Jian-Yun ZHANG ; Gang LI
Acta Academiae Medicinae Sinicae 2023;45(2):273-279
Objective To evaluate the accuracy of different convolutional neural networks (CNNs), representative deep learning models, in the differential diagnosis of ameloblastoma and odontogenic keratocyst, and subsequently compare the diagnostic results between the models and oral radiologists. Methods A total of 1 000 digital panoramic radiographs were retrospectively collected from patients with ameloblastoma (500 radiographs) or odontogenic keratocyst (500 radiographs) in the Department of Oral and Maxillofacial Radiology, Peking University School of Stomatology. Eight CNNs, including ResNet (18, 50, 101), VGG (16, 19), and EfficientNet (b1, b3, b5), were selected to distinguish ameloblastoma from odontogenic keratocyst. Transfer learning was employed to train on the 800 panoramic radiographs in the training set through 5-fold cross-validation, and the 200 panoramic radiographs in the test set were used for differential diagnosis. The Chi-square test was performed to compare the performance among the different CNNs. Furthermore, 7 oral radiologists (2 seniors and 5 juniors) made diagnoses on the 200 panoramic radiographs in the test set, and the diagnostic results were compared between the CNNs and the oral radiologists. Results The eight neural network models showed diagnostic accuracies ranging from 82.50% to 87.50%, of which EfficientNet b1 had the highest accuracy at 87.50%. There was no significant difference in diagnostic accuracy among the CNN models (P=0.998, P=0.905). The average diagnostic accuracy of the oral radiologists was (70.30±5.48)%, with no statistically significant difference in accuracy between senior and junior oral radiologists (P=0.883). The diagnostic accuracy of the CNN models was higher than that of the oral radiologists (P<0.001). Conclusion Deep learning CNNs can achieve accurate differential diagnosis between ameloblastoma and odontogenic keratocyst on panoramic radiographs, with higher diagnostic accuracy than oral radiologists.
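The transfer-learning-with-5-fold-cross-validation protocol described above can be sketched as follows, here with EfficientNetB1 (the best performer reported) via Keras. The preprocessing, fine-tuning depth, and training settings are assumptions not given in the abstract, and placeholder arrays stand in for the 800 training radiographs.

```python
# Assumed sketch: ImageNet transfer learning with stratified 5-fold CV for the
# two-class task (ameloblastoma vs. odontogenic keratocyst).
import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import EfficientNetB1

def build_model() -> Model:
    base = EfficientNetB1(include_top=False, weights="imagenet",
                          input_shape=(240, 240, 3), pooling="avg")
    out = layers.Dense(1, activation="sigmoid")(base.output)  # binary head
    model = Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

x = np.zeros((800, 240, 240, 3), dtype=np.float32)  # placeholder images
y = np.array([0, 1] * 400)  # 0 = ameloblastoma, 1 = odontogenic keratocyst

for fold, (train_idx, val_idx) in enumerate(
        StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(x, y)):
    model = build_model()  # fresh weights per fold
    # model.fit(x[train_idx], y[train_idx], epochs=...,
    #           validation_data=(x[val_idx], y[val_idx]))
```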
Humans ; Ameloblastoma/diagnostic imaging* ; Deep Learning ; Diagnosis, Differential ; Radiography, Panoramic ; Retrospective Studies ; Odontogenic Cysts/diagnostic imaging* ; Odontogenic Tumors

