1. Diagnostic performance of a computer-aided system for tuberculosis screening in two Philippine cities
Gabrielle P. Flores ; Reiner Lorenzo J. Tamayo ; Robert Neil F. Leong ; Christian Sergio M. Biglaen ; Kathleen Nicole T. Uy ; Renee Rose O. Maglente ; Marlex Jorome M. Nuguid ; Jason V. Alacap
Acta Medica Philippina 2025;59(2):33-40
BACKGROUND AND OBJECTIVES
The Philippines faces challenges in tuberculosis (TB) screening, among them a shortage of health workers trained and authorized to screen for TB. Deep learning neural networks (DLNNs) have shown potential in TB screening using chest radiographs (CXRs). However, local studies on AI-based TB screening are limited. This study evaluated the diagnostic performance of the qXR 3.0 technology for TB screening in Filipino adults aged 15 and older. Specifically, we evaluated the sensitivity and specificity of qXR 3.0 against radiologists' impressions and determined whether it meets the World Health Organization (WHO) standards.
METHODS
A prospective cohort design was used to compare the screening and diagnostic accuracies of qXR 3.0 and two radiologists' gradings, in accordance with the Standards for Reporting Diagnostic Accuracy (STARD). Patients seeking consultation at two clinics in Metro Manila equipped with qXR 3.0 were invited to participate and had CXRs and sputum specimens collected. The radiologists' and qXR 3.0's readings and impressions were compared against the reference standard, the Xpert MTB/RIF assay. Diagnostic accuracy measures were calculated.
RESULTS
With 82 participants, qXR 3.0 demonstrated 100% sensitivity and 72.7% specificity with respect to the reference standard. There was strong agreement between qXR 3.0 and the radiologists' readings, as exhibited by concordance indices of 0.7895 (qXR 3.0 and CXRs read by at least one radiologist), 0.9362 (qXR 3.0 and CXRs read by both radiologists), and 0.9403 (qXR 3.0 and CXRs read as not suggestive of TB by at least one radiologist).
CONCLUSIONS
qXR 3.0 demonstrated high sensitivity in identifying the presence of TB among patients, and meets the WHO standard of at least 70% specificity for detecting true TB infection. This shows immense potential for the tool to offset the shortage of radiologists for TB screening in the country. Future research may consider larger sample sizes to confirm these findings and explore the economic value of mainstream adoption of qXR 3.0 for TB screening.
Human ; Tuberculosis ; Diagnostic Imaging ; Deep Learning
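For reference, the sensitivity and specificity reported above reduce to simple ratios over a 2×2 confusion table. A minimal sketch follows; the abstract does not give the underlying counts, so the split below is a hypothetical one consistent with the reported 82 participants, 100% sensitivity, and 72.7% specificity.

```python
# Hypothetical 2x2 counts consistent with the abstract's figures;
# the study's actual case split is not reported.
tp, fn = 16, 0   # Xpert-positive participants flagged / missed by qXR 3.0
tn, fp = 48, 18  # Xpert-negative participants cleared / flagged by qXR 3.0

sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate

print(f"sensitivity = {sensitivity:.1%}")  # 100.0%
print(f"specificity = {specificity:.1%}")  # 72.7%
```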
2. The impact of anatomic racial variations on artificial intelligence analysis of Filipino retinal fundus photographs using an image-based deep learning model
Carlo A. Kasala ; Kaye Lani Rea B. Locaylocay ; Paolo S. Silva
Philippine Journal of Ophthalmology 2024;49(2):130-137
OBJECTIVES
This study evaluated the accuracy of an artificial intelligence (AI) model in identifying retinal lesions, validated its performance on a Filipino population dataset, and assessed the impact of dataset diversity on the accuracy of AI analysis.
METHODS
This cross-sectional, analytical, institutional study analyzed standardized macula-centered fundus photos taken with the Zeiss Visucam®. The AI model's output was compared with manual readings by trained retina specialists.
RESULTS
A total of 215 eyes from 109 patients were included in the study. Human graders identified 109 eyes (50.7%) with retinal abnormalities. The AI model demonstrated an overall accuracy of 73.0% (95% CI 66.6% – 78.8%) in detecting abnormal retinas, with a sensitivity of 54.1% (95% CI 44.3% – 63.7%) and specificity of 92.5% (95% CI 85.7% – 96.7%).
CONCLUSION
The availability and sources of AI training datasets can introduce biases into AI algorithms. In our dataset, racial differences in retinal morphology, such as differences in retinal pigmentation, affected the accuracy of AI image-based analysis. More diverse datasets and external validation on different populations are needed to mitigate these biases.
Human ; Artificial Intelligence ; Deep Learning
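The interval estimates above are standard binomial confidence intervals. A minimal sketch using the Wilson score method follows; the paper does not state which CI method it used, and the correct count of 157/215 (≈73.0%) is inferred from the reported accuracy, so the output may differ slightly from the published figures.

```python
# Hedged sketch: 95% Wilson score interval for a reported proportion.
import math

def wilson_ci(successes, n, z=1.96):
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

lo, hi = wilson_ci(157, 215)  # 157/215 = 73.0% accuracy, assumed count
print(f"accuracy 95% CI: {lo:.1%} - {hi:.1%}")  # ~66.7% - 78.5%
```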
3. SPECT-MPI for Coronary Artery Disease: A deep learning approach
Vincent Peter C. Magboo ; Ma. Sheila A. Magboo
Acta Medica Philippina 2024;58(8):67-75
Background:
Worldwide, coronary artery disease (CAD) is a leading cause of mortality and morbidity and remains a top health priority in many countries. A non-invasive imaging modality for the diagnosis of CAD, such as single-photon emission computed tomography myocardial perfusion imaging (SPECT-MPI), is usually requested by cardiologists as it displays radiotracer distribution in the heart, reflecting myocardial perfusion. SPECT-MPI is interpreted visually by a nuclear medicine physician; the interpretation depends largely on the physician's clinical experience and shows significant inter-observer variability.
Objective:
The aim of this study was to apply a deep learning approach to the classification of SPECT-MPI for perfusion abnormalities using convolutional neural networks (CNNs).
Methods:
A publicly available anonymized SPECT-MPI dataset from a machine learning repository (https://www.kaggle.com/selcankaplan/spect-mpi) was used in this study, involving 192 patients who underwent stress-test-rest Tc99m MPI. An exploratory CNN hyperparameter search for the optimal neural network model was performed, with particular focus on dropout rates (0.2, 0.5, 0.7), batch sizes (8, 16, 32, 64), and numbers of dense nodes (32, 64, 128, 256). The base CNN model was also compared with pre-trained CNNs commonly used on medical images, such as VGG16, InceptionV3, DenseNet121, and ResNet50. All simulation experiments were performed in Kaggle using TensorFlow 2.6.0, Keras 2.6.0, and Python 3.7.10.
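As an illustration only, here is a minimal Keras sketch of a base CNN wired with the best-performing hyperparameters reported in the Results (dropout 0.7, 32 dense nodes, batch size 8). The input size, number of convolutional blocks, and binary output head are not given in the abstract and are assumptions.

```python
# Hedged sketch of a base CNN; architecture details beyond the reported
# hyperparameters (dropout 0.7, 32 dense nodes) are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_base_cnn(input_shape=(128, 128, 3)):  # assumed input size
    model = models.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(32, activation="relu"),     # 32 dense nodes
        layers.Dropout(0.7),                     # best dropout rate
        layers.Dense(1, activation="sigmoid"),   # normal vs. abnormal perfusion
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_base_cnn()
# model.fit(train_images, train_labels, batch_size=8, epochs=20,
#           validation_data=(val_images, val_labels))
```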
Results:
The best-performing base CNN model, with 0.7 dropout, batch size 8, and 32 dense nodes, generated the highest normalized Matthews correlation coefficient at 0.909 and obtained 93.75% accuracy, 96.00% sensitivity, 96.00% precision, and 96.00% F1-score. It also obtained higher classification performance than the pre-trained architectures.
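On the normalized Matthews correlation coefficient: a common convention rescales MCC from [−1, 1] to [0, 1] as nMCC = (MCC + 1) / 2. The abstract does not define its normalization, so the sketch below is under that assumption, with purely illustrative labels.

```python
# Hedged sketch of normalized MCC; normalization convention assumed,
# labels illustrative rather than the study's data.
from sklearn.metrics import matthews_corrcoef

y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # illustrative ground truth
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]  # illustrative predictions

mcc = matthews_corrcoef(y_true, y_pred)
nmcc = (mcc + 1) / 2  # rescale [-1, 1] -> [0, 1]
print(f"MCC = {mcc:.3f}, normalized MCC = {nmcc:.3f}")
```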
Conclusions:
The results suggest that deep learning approaches using CNN models can be deployed by nuclear medicine physicians in clinical practice to augment their decision-making in the interpretation of SPECT-MPI tests. These CNN models can also serve as a dependable and valid second opinion and decision-support tool, as well as teaching material for less-experienced physicians, particularly those still in training. This highlights the clinical utility of deep learning approaches through CNN models in the practice of nuclear cardiology.
Coronary Artery Disease ; Deep Learning
4. Deep learning-based radiomics allows for a more accurate assessment of sarcopenia as a prognostic factor in hepatocellular carcinoma.
Zhikun LIU ; Yichao WU ; Abid Ali KHAN ; Lu LUN ; Jianguo WANG ; Jun CHEN ; Ningyang JIA ; Shusen ZHENG ; Xiao XU
Journal of Zhejiang University. Science. B 2024;25(1):83-90
Hepatocellular carcinoma (HCC) is one of the most common malignancies and a major cause of cancer-related mortality worldwide (Forner et al., 2018; He et al., 2023). Sarcopenia is a syndrome characterized by an accelerated loss of skeletal muscle (SM) mass that may be age-related or the result of malnutrition in cancer patients (Cruz-Jentoft and Sayer, 2019). Preoperative sarcopenia in HCC patients treated with hepatectomy or liver transplantation is an independent risk factor for poor survival (Voron et al., 2015; van Vugt et al., 2016). Previous studies have used various criteria to define sarcopenia, including muscle area and density. However, the lack of standardized diagnostic methods for sarcopenia limits their clinical use. In 2018, the European Working Group on Sarcopenia in Older People (EWGSOP) renewed its consensus on the definition of sarcopenia: low muscle strength, loss of muscle quantity, and poor physical performance (Cruz-Jentoft et al., 2019). Radiological imaging-based measurement of muscle quantity or mass is most commonly used to evaluate the degree of sarcopenia. The gold standard is to measure the SM and/or psoas muscle (PM) area using abdominal computed tomography (CT) at the third lumbar vertebra (L3), as it is linearly correlated with whole-body SM mass (van Vugt et al., 2016). According to a "North American Expert Opinion Statement on Sarcopenia," the SM index (SMI) is the preferred measure of sarcopenia (Carey et al., 2019). The variability between morphometric muscle indexes shows that they have different clinical relevance and are generally not applicable to broader populations (Esser et al., 2019).
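As context for the L3-based measurement above, the SMI is conventionally the L3 skeletal-muscle cross-sectional area divided by height squared (cm²/m²). A minimal sketch follows; the random mask is an illustrative stand-in for a real segmentation, not part of the study's pipeline.

```python
# Hedged sketch of SMI = L3 muscle area (cm^2) / height^2 (m^2).
import numpy as np

def smi_from_mask(mask, pixel_spacing_mm, height_m):
    """mask: binary L3 skeletal-muscle segmentation (1 = muscle)."""
    pixel_area_cm2 = (pixel_spacing_mm[0] / 10) * (pixel_spacing_mm[1] / 10)
    muscle_area_cm2 = mask.sum() * pixel_area_cm2
    return muscle_area_cm2 / height_m**2

# Illustrative example: a fake 512x512 mask with ~20 000 muscle pixels.
rng = np.random.default_rng(0)
mask = (rng.random((512, 512)) < 0.076).astype(np.uint8)
print(f"SMI = {smi_from_mask(mask, (0.8, 0.8), 1.70):.1f} cm^2/m^2")
```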
Humans ; Aged ; Sarcopenia/diagnostic imaging* ; Carcinoma, Hepatocellular/diagnostic imaging* ; Muscle, Skeletal/diagnostic imaging* ; Deep Learning ; Prognosis ; Radiomics ; Liver Neoplasms/diagnostic imaging* ; Retrospective Studies
5. Deep learning method for magnetic resonance imaging fluid-attenuated inversion recovery image synthesis.
Jianing ZHOU ; Hongyu GUO ; Hong CHEN
Journal of Biomedical Engineering 2023;40(5):903-911
Magnetic resonance imaging (MRI) can obtain multi-modal images with different contrasts, which provide rich information for clinical diagnosis. However, some contrast images are not scanned, or the quality of the acquired images cannot meet diagnostic requirements, due to difficulty with patient cooperation or limitations of the scanning conditions. Image synthesis techniques have become a way to compensate for such missing images. In recent years, deep learning has been widely used in the field of MRI synthesis. In this paper, a synthesis network based on multi-modal fusion is proposed: a feature encoder first encodes the features of multiple unimodal images separately, a feature fusion module then fuses the features of the different modal images, and the target modal image is finally generated. The similarity measure between the target image and the predicted image in the network is improved by introducing a dynamically weighted combined loss function based on the spatial domain and the k-space domain. After experimental validation and quantitative comparison, the multi-modal fusion deep learning network proposed in this paper can effectively synthesize high-quality MRI fluid-attenuated inversion recovery (FLAIR) images. In summary, the proposed method can reduce the patient's MRI scanning time and solve the clinical problem of missing FLAIR images or image quality that fails to meet diagnostic requirements.
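A minimal sketch of a combined spatial- and k-space-domain L1 loss of the kind described above. The fusion network itself is omitted, single-channel NHWC images are assumed, and the weighting here is a fixed convex combination, whereas the paper uses dynamic weights whose exact form the abstract does not give.

```python
# Hedged sketch: loss = w1 * L1(image domain) + w2 * L1(k-space domain).
import tensorflow as tf

def combined_loss(y_true, y_pred, w_spatial=0.5, w_kspace=0.5):
    # Spatial-domain L1 between target and predicted FLAIR images.
    l_spatial = tf.reduce_mean(tf.abs(y_true - y_pred))
    # K-space L1: compare the 2D Fourier transforms of the two images
    # (channel axis squeezed away, assuming shape [batch, H, W, 1]).
    fft_true = tf.signal.fft2d(tf.cast(tf.squeeze(y_true, -1), tf.complex64))
    fft_pred = tf.signal.fft2d(tf.cast(tf.squeeze(y_pred, -1), tf.complex64))
    l_kspace = tf.reduce_mean(tf.abs(fft_true - fft_pred))
    return w_spatial * l_spatial + w_kspace * l_kspace
```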
Humans ; Deep Learning ; Magnetic Resonance Imaging/methods* ; Image Processing, Computer-Assisted/methods*
6. Review on ultrasonographic diagnosis of thyroid diseases based on deep learning.
Fengyuan QI ; Min QIU ; Guohui WEI
Journal of Biomedical Engineering 2023;40(5):1027-1032
In recent years, the incidence of thyroid diseases has increased significantly, and ultrasound examination is the first choice for the diagnosis of thyroid diseases. At the same time, medical image analysis based on deep learning has improved rapidly: ultrasonic image analysis has made a series of milestone breakthroughs, and deep learning algorithms have shown strong performance in medical image segmentation and classification. This article first elaborates on the application of deep learning algorithms to thyroid ultrasound image segmentation, feature extraction, and classification. Secondly, it summarizes algorithms for deep learning on multimodal ultrasound images. Finally, it points out the problems in thyroid ultrasound image diagnosis at the current stage and looks forward to future development directions. This review can promote the application of deep learning in clinical thyroid ultrasound diagnosis and provide a reference for doctors diagnosing thyroid disease.
Humans ; Algorithms ; Deep Learning ; Image Processing, Computer-Assisted/methods* ; Thyroid Diseases/diagnostic imaging* ; Ultrasonography
7. Metal artifact reduction and clinical verification in oral and maxillofacial region based on deep learning.
Wei ZENG ; Shan Luo ZHOU ; Ji Xiang GUO ; Wei TANG
Chinese Journal of Stomatology 2023;58(6):540-546
Objective: To construct a neural network for eliminating metal artifacts in CT images by training a generative adversarial network (GAN) model, so as to provide a reference for clinical practice. Methods: The CT data of patients treated in the Department of Radiology, West China Hospital of Stomatology, Sichuan University from January 2017 to June 2022 were collected. A total of 1 000 artifact-free CT datasets and 620 metal-artifact CT datasets were obtained, covering 5 types of metal restorative materials: fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies. Four hundred metal-artifact CT datasets and 1 000 artifact-free CT datasets were used for simulation synthesis, constructing 1 000 pairs of simulated metal-artifact images and their corresponding artifact-free images (200 pairs for each type). With the data for the five metal artifact types kept equal, the entire dataset was randomly (computer-generated randomization) divided into a training set (800 pairs) and a test set (200 pairs). The former was used to train the GAN model, and the latter was used to evaluate its performance. The test set was evaluated quantitatively using root-mean-square error (RMSE) and the structural similarity index measure (SSIM). The trained GAN model was then employed to eliminate the metal artifacts from the CT data of the remaining 220 clinical cases with metal-artifact CT data, and the elimination results were evaluated by two senior attending doctors using a modified Likert scale. Results: The RMSE values for artifact elimination of fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies in the test set were 0.018±0.004, 0.023±0.007, 0.015±0.003, 0.019±0.004, and 0.024±0.008, respectively (F=1.29, P=0.274). The SSIM values were 0.963±0.023, 0.961±0.023, 0.965±0.013, 0.958±0.022, and 0.957±0.026, respectively (F=2.22, P=0.069). The intraclass correlation coefficient between the 2 evaluators was 0.972. For the 220 clinical cases, the overall modified Likert scale score was 3.73±1.13, indicating satisfactory performance. The modified Likert scale scores for fillings, crowns, titanium plates and screws, orthodontic brackets, and metal foreign bodies were 3.68±1.13, 3.67±1.16, 3.97±1.03, 3.83±1.14, and 3.33±1.12, respectively (F=1.44, P=0.145). Conclusions: The metal artifact reduction GAN model constructed in this study can effectively remove the interference of metal artifacts and improve image quality.
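A minimal sketch of the two quantitative metrics used above, RMSE and SSIM, computed between an artifact-corrected image and its artifact-free reference. The arrays are illustrative stand-ins normalized to [0, 1], not the study's CT data.

```python
# Hedged sketch of RMSE and SSIM on illustrative image arrays.
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256)).astype(np.float32)  # artifact-free slice
corrected = np.clip(reference + rng.normal(0, 0.02, reference.shape),
                    0, 1).astype(np.float32)           # GAN output stand-in

rmse = np.sqrt(np.mean((reference - corrected) ** 2))
ssim = structural_similarity(reference, corrected, data_range=1.0)
print(f"RMSE = {rmse:.3f}, SSIM = {ssim:.3f}")
```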
Humans ; Tomography, X-Ray Computed/methods* ; Deep Learning ; Titanium ; Neural Networks, Computer ; Metals ; Image Processing, Computer-Assisted/methods* ; Algorithms
8. Automated diagnostic classification with lateral cephalograms based on deep learning network model.
Qiao CHANG ; Shao Feng WANG ; Fei Fei ZUO ; Fan WANG ; Bei Wen GONG ; Ya Jie WANG ; Xian Ju XIE
Chinese Journal of Stomatology 2023;58(6):547-553
Objective: To establish a comprehensive diagnostic classification model for lateral cephalograms based on artificial intelligence (AI) to provide a reference for orthodontic diagnosis. Methods: A total of 2 894 lateral cephalograms were collected in the Department of Orthodontics, Capital Medical University School of Stomatology from January 2015 to December 2021 to construct a dataset, including 1 351 males and 1 543 females with a mean age of (26.4±7.4) years. First, 2 orthodontists (with 5 and 8 years of orthodontic experience, respectively) performed manual annotation and calculated measurements for primary classification; then 2 senior orthodontists (each with more than 20 years of orthodontic experience) verified the 8 diagnostic classifications, covering skeletal and dental indices. The data were randomly divided into training, validation, and test sets in a ratio of 7∶2∶1. The open-source DenseNet121 was used to construct the model. The performance of the model was evaluated by classification accuracy, precision, sensitivity, specificity, and area under the curve (AUC). Regions of interest of the model were visualized through class activation heat maps. Results: The automatic classification model for lateral cephalograms was successfully established. It took 0.012 s on average to make all 8 diagnoses on one lateral cephalogram. The accuracy of 5 classifications was 80%-90%, including sagittal and vertical skeletal facial pattern, mandibular growth, inclination of upper incisors, and protrusion of lower incisors. The accuracy of 3 classifications was 70%-80%, including maxillary growth, inclination of lower incisors, and protrusion of upper incisors. The average AUC of each classification was ≥0.90. The class activation heat maps of successfully classified lateral cephalograms showed that the AI model's activation regions were distributed in the relevant structural regions. Conclusions: In this study, an automatic classification model for lateral cephalograms was established based on DenseNet121 to achieve rapid classification of eight commonly used clinical diagnostic items.
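A minimal Keras sketch of a DenseNet121-based classifier like the one described above. Whether the study trained one multi-task network or separate heads per diagnostic item is not stated in the abstract, so a single softmax head, ImageNet weights, and a 224×224 input are assumed here.

```python
# Hedged sketch of a DenseNet121 transfer-learning classifier.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cephalogram_classifier(num_classes, input_shape=(224, 224, 3)):
    base = tf.keras.applications.DenseNet121(
        include_top=False, weights="imagenet", input_shape=input_shape)
    x = layers.GlobalAveragePooling2D()(base.output)
    out = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(base.input, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# e.g., a 3-class head for sagittal skeletal pattern (Class I/II/III)
model = build_cephalogram_classifier(num_classes=3)
```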
Male ; Female ; Humans ; Young Adult ; Adult ; Artificial Intelligence ; Deep Learning ; Cephalometry ; Maxilla ; Mandible/diagnostic imaging*
9. Research on multi-class orthodontic image recognition system based on deep learning network model.
Shao Feng WANG ; Xian Ju XIE ; Li ZHANG ; Qiao CHANG ; Fei Fei ZUO ; Ya Jie WANG ; Yu Xing BAI
Chinese Journal of Stomatology 2023;58(6):561-568
Objective: To develop a multi-class orthodontic image recognition system using the SqueezeNet deep learning model for automatic classification of orthodontic image data. Methods: A total of 35 000 clinical orthodontic images were collected in the Department of Orthodontics, Capital Medical University School of Stomatology, from October to November 2020 and June to July 2021. The images were from 490 orthodontic patients with a male-to-female ratio of 49∶51 and an age range of 4 to 45 years. After data cleaning based on inclusion and exclusion criteria, the final image dataset included 17 453 facial images (frontal, smiling, 90° right, 90° left, 45° right, and 45° left), 8 026 intraoral images [frontal occlusion, right occlusion, left occlusion, upper occlusal view (original and flipped), lower occlusal view (original and flipped), and coverage of the occlusal relationship], 4 115 X-ray images [lateral skull X-ray from the left side, lateral skull X-ray from the right side, frontal skull X-ray, cone-beam CT (CBCT), and wrist bone X-ray], and 684 other non-orthodontic images. A labeling team composed of orthodontic doctoral students, associate professors, and professors used image-labeling tools to classify the orthodontic images into 20 categories, including 6 facial image categories, 8 intraoral image categories, 5 X-ray image categories, and other images. The data for each label were randomly divided into training, validation, and test sets in an 8∶1∶1 ratio using the random function in the Python programming language. The improved SqueezeNet deep learning model was used for training, and 13 000 natural images from the open-source ImageNet dataset were used as additional non-orthodontic images for algorithm optimization of anomaly data processing. A multi-class orthodontic image recognition system based on deep learning models was thus constructed. The accuracy of orthodontic image classification was evaluated using precision, recall, F1 score, and the confusion matrix based on the prediction results on the test set. The reliability of the model's classification logic was verified by generating heat maps with the gradient-weighted class activation mapping (Grad-CAM) method. Results: After data cleaning and labeling, a total of 30 278 orthodontic images were included in the dataset. The test set classification results showed that the precision, recall, and F1 scores of most classification labels were 100%, with only 5 misclassified images out of 3 047, giving a system accuracy of 99.84% (3 042/3 047). The precision of anomaly data processing was 100% (10 500/10 500). The heat maps showed that the judgment basis of the SqueezeNet deep learning model in the image classification process was basically consistent with that of humans. Conclusions: This study developed a multi-class orthodontic image recognition system for automatic classification of 20 types of orthodontic images based on the improved SqueezeNet deep learning model. The system exhibited good accuracy in orthodontic image classification.
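For orientation, SqueezeNet is built from "fire modules": a 1×1 squeeze convolution followed by concatenated 1×1 and 3×3 expand convolutions. A minimal sketch of the standard module (Iandola et al., 2016) follows; the study's specific improvements to SqueezeNet are not described in the abstract, so this is not the modified version used there.

```python
# Hedged sketch of SqueezeNet's standard fire module in Keras.
import tensorflow as tf
from tensorflow.keras import layers

def fire_module(x, squeeze_filters, expand_filters):
    # 1x1 "squeeze" convolution reduces the channel count...
    s = layers.Conv2D(squeeze_filters, 1, activation="relu")(x)
    # ...then parallel 1x1 and 3x3 "expand" convolutions are concatenated.
    e1 = layers.Conv2D(expand_filters, 1, activation="relu")(s)
    e3 = layers.Conv2D(expand_filters, 3, padding="same", activation="relu")(s)
    return layers.Concatenate()([e1, e3])

inputs = tf.keras.Input(shape=(224, 224, 3))
x = layers.Conv2D(96, 7, strides=2, activation="relu")(inputs)
x = layers.MaxPooling2D(3, strides=2)(x)
x = fire_module(x, squeeze_filters=16, expand_filters=64)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(20, activation="softmax")(x)  # 20 image categories
model = tf.keras.Model(inputs, outputs)
```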
Humans ; Male ; Female ; Child, Preschool ; Child ; Adolescent ; Young Adult ; Adult ; Middle Aged ; Deep Learning ; Reproducibility of Results ; Radiography ; Algorithms ; Cone-Beam Computed Tomography

