1.Deep Learning Technology for Classification of Thyroid Nodules Using Multi-View Ultrasound Images: Potential Benefits and Challenges in Clinical Application
Jinyoung KIM ; Min-Hee KIM ; Dong-Jun LIM ; Hankyeol LEE ; Jae Jun LEE ; Hyuk-Sang KWON ; Mee Kyoung KIM ; Ki-Ho SONG ; Tae-Jung KIM ; So Lyung JUNG ; Yong Oh LEE ; Ki-Hyun BAEK
Endocrinology and Metabolism 2025;40(2):216-224
Background:
This study aimed to evaluate the applicability of deep learning technology to thyroid ultrasound images for classification of thyroid nodules.
Methods:
This retrospective analysis included ultrasound images of patients with thyroid nodules investigated by fine-needle aspiration at the thyroid clinic of a single center from April 2010 to September 2012. Thyroid nodules with cytopathologic results of Bethesda category V (suspicious for malignancy) or VI (malignant) were defined as thyroid cancer. Multiple deep learning algorithms based on convolutional neural networks (CNNs), namely ResNet, DenseNet, and EfficientNet, were utilized, and Siamese neural networks facilitated multi-view analysis of paired transverse and longitudinal ultrasound images.
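The multi-view idea in the Methods can be illustrated with a toy sketch. It assumes a Siamese arrangement in which one shared encoder processes both views and the two feature vectors are concatenated before a logistic output; the fusion step, the tiny linear encoder, and every weight and input value below are hypothetical stand-ins, not the CNN backbones or parameters used in the study.

```python
import math

def encode(view, weights):
    """Shared encoder: one linear layer followed by ReLU (stand-in for a CNN backbone)."""
    return [max(0.0, sum(w * x for w, x in zip(row, view))) for row in weights]

def predict_malignancy(transverse, longitudinal, enc_weights, head_weights, head_bias):
    """Return a probability-like score for one nodule from its two paired views."""
    # The same enc_weights are applied to both views: the defining Siamese property.
    feat_t = encode(transverse, enc_weights)
    feat_l = encode(longitudinal, enc_weights)
    # Assumed fusion: concatenate the two feature vectors.
    fused = feat_t + feat_l
    logit = sum(w * f for w, f in zip(head_weights, fused)) + head_bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

# Hypothetical weights and 3-value "images", for illustration only.
enc_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
head_weights = [0.4, -0.6, 0.7, 0.2]
score = predict_malignancy([0.2, 0.1, 0.9], [0.7, 0.3, 0.4], enc_weights, head_weights, 0.0)
print(f"malignancy score: {score:.3f}")
```

Sharing the encoder means both views are mapped into the same feature space with half the parameters of two separate encoders, which is one common motivation for Siamese designs.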
Results:
Among 1,048 analyzed thyroid nodules from 943 patients, 306 (29%) were identified as thyroid cancer. In a subgroup analysis of transverse and longitudinal images, longitudinal images showed superior prediction ability. Multi-view modeling, based on paired transverse and longitudinal images, significantly improved the model performance, with an accuracy of 0.82 (95% confidence interval [CI], 0.80 to 0.86) with ResNet50, 0.83 (95% CI, 0.83 to 0.88) with DenseNet201, and 0.81 (95% CI, 0.79 to 0.84) with EfficientNetv2_s. Training with high-resolution images obtained using the latest equipment tended to improve model performance in association with increased sensitivity.
Conclusion
CNN algorithms applied to ultrasound images demonstrated substantial accuracy in thyroid nodule classification, indicating their potential as valuable tools for diagnosing thyroid cancer. However, in real-world clinical settings, it is important to be aware that model performance may vary depending on the quality of images acquired by different physicians and imaging devices.
5.Gut microbiome and metabolome signatures in liver cirrhosis-related complications
Satya Priya SHARMA ; Haripriya GUPTA ; Goo-Hyun KWON ; Sang Yoon LEE ; Seol Hee SONG ; Jeoung Su KIM ; Jeong Ha PARK ; Min Ju KIM ; Dong-Hoon YANG ; Hyunjoon PARK ; Sung-Min WON ; Jin-Ju JEONG ; Ki-Kwang OH ; Jung A EOM ; Kyeong Jin LEE ; Sang Jun YOON ; Young Lim HAM ; Gwang Ho BAIK ; Dong Joon KIM ; Ki Tae SUK
Clinical and Molecular Hepatology 2024;30(4):845-862
Background/Aims:
Shifts in the gut microbiota and metabolites are interrelated with liver cirrhosis progression and complications. However, causal relationships have not been evaluated comprehensively. Here, we identified complication-dependent gut microbiota and metabolic signatures in patients with liver cirrhosis.
Methods:
Microbiome taxonomic profiling was performed on 194 stool samples (52 controls and 142 cirrhosis patients) via V3-V4 16S rRNA sequencing. Next, 51 samples (17 controls and 34 cirrhosis patients) were selected for fecal metabolite profiling via gas chromatography mass spectrometry and liquid chromatography coupled to time-of-flight mass spectrometry. Correlation analyses were performed targeting the gut microbiota, metabolites, clinical parameters, and presence of complications (varices, ascites, peritonitis, encephalopathy, hepatorenal syndrome, hepatocellular carcinoma, and death).
Results:
Veillonella bacteria, Ruminococcus gnavus, and Streptococcus pneumoniae were identified as cirrhosis-related microbiota relative to the control group. Bacteroides ovatus, Clostridium symbiosum, Emergencia timonensis, Fusobacterium varium, and Hungatella_uc were associated with complications in the cirrhosis group. The areas under the receiver operating characteristic curve (AUROCs) for the diagnosis of cirrhosis, encephalopathy, hepatorenal syndrome, and death were 0.863, 0.733, 0.71, and 0.69, respectively. The AUROCs of mixed microbial species for the diagnosis of cirrhosis and complications were 0.808 and 0.847, respectively. According to the metabolic profile, 5 fecal metabolites that were increased in patients with cirrhosis served as biomarkers (AUROC >0.880) for the diagnosis of cirrhosis and complications. Clinical markers were significantly correlated with the gut microbiota and metabolites.
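For readers unfamiliar with the AUROC values reported above, the following minimal sketch computes AUROC from its rank-based definition: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The scores and labels are invented for illustration and are not data from the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of cases (label 1) and controls (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs ranked correctly; a tie contributes 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical relative abundances of one taxon in 2 patients and 2 controls.
print(auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An AUROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.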
Conclusions
Cirrhosis-dependent gut microbiota and metabolites present unique signatures that can be used as noninvasive biomarkers for the diagnosis of cirrhosis and its complications.
6.Prognostic Roles of Inflammatory Biomarkers in Radioiodine-Refractory Thyroid Cancer Treated with Lenvatinib
Chae A KIM ; Mijin KIM ; Meihua JIN ; Hee Kyung KIM ; Min Ji JEON ; Dong Jun LIM ; Bo Hyun KIM ; Ho-Cheol KANG ; Won Bae KIM ; Dong Yeob SHIN ; Won Gu KIM
Endocrinology and Metabolism 2024;39(2):334-343
Background:
Inflammatory biomarkers, such as the neutrophil-to-lymphocyte ratio (NLR), lymphocyte-to-monocyte ratio (LMR), and platelet-to-lymphocyte ratio (PLR), serve as valuable prognostic indicators in various cancers. This multicenter, retrospective cohort study assessed the treatment outcomes of lenvatinib in 71 patients with radioactive iodine (RAI)-refractory thyroid cancer, considering the baseline inflammatory biomarkers.
Methods:
This study retrospectively included patients from five tertiary hospitals in Korea whose complete blood counts were available before lenvatinib treatment. Progression-free survival (PFS) and overall survival (OS) were evaluated based on the median value of inflammatory biomarkers.
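As an illustration of the grouping described above, the sketch below derives the three ratios from a complete blood count and splits a cohort at the median. The function names and the example counts are hypothetical; treating values equal to the median as "high" follows the "≥" notation used in the Results.

```python
from statistics import median

def inflammatory_ratios(neutrophils, lymphocytes, monocytes, platelets):
    """Compute NLR, LMR, and PLR from absolute counts given in the same units."""
    return {
        "NLR": neutrophils / lymphocytes,
        "LMR": lymphocytes / monocytes,
        "PLR": platelets / lymphocytes,
    }

def split_by_median(values):
    """Return the median cutoff and a flag per patient (True = at or above the median)."""
    cutoff = median(values)
    return cutoff, [v >= cutoff for v in values]

# Hypothetical baseline counts (neutrophils, lymphocytes, monocytes, platelets).
cohort = [(4.0, 2.0, 0.5, 250), (6.0, 1.5, 0.6, 300), (3.0, 2.5, 0.4, 200)]
nlr_values = [inflammatory_ratios(*cbc)["NLR"] for cbc in cohort]
cutoff, high_nlr = split_by_median(nlr_values)
print(cutoff, high_nlr)
```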
Results:
No significant differences in baseline characteristics were observed among patients grouped according to the inflammatory biomarkers, except that patients with a higher-than-median NLR (≥2) were older than their counterparts with a lower NLR (P=0.01). Patients with a higher-than-median NLR had significantly shorter PFS (P=0.02) and OS (P=0.017) than those with a lower NLR. In multivariate analysis, a higher-than-median NLR was significantly associated with poor OS (hazard ratio, 3.0; 95% confidence interval, 1.24 to 7.29; P=0.015). However, neither the LMR nor the PLR was associated with PFS. A higher-than-median LMR (≥3.9) was significantly associated with prolonged OS compared to a lower LMR (P=0.036). In contrast, a higher-than-median PLR (≥142.1) was associated with shorter OS compared to a lower PLR (P=0.039).
Conclusion
Baseline inflammatory biomarkers can serve as predictive indicators of PFS and OS in patients with RAI-refractory thyroid cancer treated with lenvatinib.
7.Cost-Utility Analysis of Early Detection with Ultrasonography of Differentiated Thyroid Cancer: A Retrospective Study on a Korean Population
Han-Sang BAEK ; Jeonghoon HA ; Kwangsoon KIM ; Ja Seong BAE ; Jeong Soo KIM ; Sungju KIM ; Dong-Jun LIM ; Chul-Min KIM
Endocrinology and Metabolism 2024;39(2):310-323
Background:
There is debate about ultrasonography screening for thyroid cancer and its cost-effectiveness. This study aimed to evaluate the cost-effectiveness of early screening (ES) versus symptomatic detection (SD) for differentiated thyroid cancer (DTC) in Korea.
Methods:
A Markov decision analysis model was constructed to compare the cost-effectiveness of ES and SD. The model considered direct medical costs, health outcomes, and different diagnostic and treatment pathways. Input data were derived from literature and Korean population studies. Incremental cost-effectiveness ratio (ICER) was calculated. Willingness-to-pay (WTP) threshold was set at USD 100,000 or 20,000 per quality-adjusted life year (QALY) gained. Sensitivity analyses were conducted to address uncertainties of the model’s variables.
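The ICER mentioned above is conventionally the difference in cost between two strategies divided by the difference in QALYs. The sketch below shows that arithmetic and the comparison against a WTP threshold; the function names and the cost and QALY inputs are hypothetical and are not outputs of the study's Markov model.

```python
def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost per QALY gained by the new strategy over the reference."""
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when both strategies yield equal QALYs")
    return (cost_new - cost_ref) / delta_qaly

def is_cost_effective(icer_value, wtp):
    """Simple decision rule: accept the new strategy if its ICER does not exceed WTP."""
    return icer_value <= wtp

# Hypothetical per-patient totals: screening costs 2,000 more and gains 0.5 QALY.
value = icer(cost_new=12000, cost_ref=10000, qaly_new=10.5, qaly_ref=10.0)
print(value, is_cost_effective(value, wtp=100000))
```

The simple threshold rule only applies when the new strategy costs more and yields more QALYs; a negative ICER needs separate interpretation, because it can mean either a dominant or a dominated strategy.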
Results:
In a base case scenario with 50 years of follow-up, ES was found to be cost-effective compared to SD, with an ICER of $2,852 per QALY. With WTP set at $100,000, SD was cost-effective when follow-up was less than 10 years. Sensitivity analysis showed that variables such as lobectomy probability, age, mortality, and utility scores significantly influenced the ICER. Despite variations in costs and other factors, all ICER values remained below the WTP threshold.
Conclusion
Findings of this study indicate that ES is a cost-effective strategy for DTC screening in the Korean medical system. Early detection and subsequent lobectomy contribute to the cost-effectiveness of ES, while SD at an advanced stage makes ES more cost-effective. Expected follow-up duration should be considered to determine an optimal strategy for DTC screening.
8.In Vitro Investigation of HIF-1α as a Therapeutic Target for Thyroid-Associated Ophthalmopathy
Jeongmin LEE ; Jinsoo LEE ; Hansang BAEK ; Dong-Jun LIM ; Seong-Beom LEE ; Jung-Min LEE ; Sang-Ah JANG ; Moo Il KANG ; Suk-Woo YANG ; Min-Hee KIM
Endocrinology and Metabolism 2024;39(5):767-776
Background:
Thyroid-associated ophthalmopathy (TAO) involves tissue expansion and inflammation, potentially causing a hypoxic microenvironment. Hypoxia-inducible factor (HIF)-1α is crucial in fibrosis and adipogenesis, which are observed in TAO progression. We investigated the effects of hypoxia on orbital fibroblasts (OFs) in TAO, focusing on the role of HIF-1α in TAO progression.
Methods:
OFs were isolated from TAO and non-TAO patients (as controls). In addition to HIF-1α, adipogenic differentiation markers including peroxisome proliferator-activated receptor γ (PPARγ) and CCAAT/enhancer binding protein (CEBP) were measured by Western blot, and phenotype changes were evaluated by Oil Red O staining under both normoxia and hypoxia. To elucidate the effect of HIF-1α inhibition, protein expression changes after HIF-1α inhibitor treatment were evaluated under normoxia and hypoxia.
Results:
TAO OFs exhibited significantly higher HIF-1α expression than non-TAO OFs, and the difference was more distinct under hypoxia than under normoxia. Oil Red O staining showed that adipogenic differentiation of TAO OFs was prominent under hypoxia. Hypoxic conditions increased the expression of adipogenic markers, namely PPARγ and CEBP, as well as HIF-1α in TAO OFs. Interleukin 6 levels also increased in response to hypoxia. The effect of hypoxia on adipogenesis was reduced at the protein level after HIF-1α inhibitor treatment, and this inhibitory effect was sustained even with IGF-1 stimulation in addition to hypoxia.
Conclusion
Hypoxia induces tissue remodeling in TAO by stimulating adipogenesis through HIF-1α activation. These data could provide insights into new treatment strategies and the mechanisms of adipose tissue remodeling in TAO.
9.ChatGPT Predicts In-Hospital All-Cause Mortality for Sepsis: In-Context Learning with the Korean Sepsis Alliance Database
Namkee OH ; Won Chul CHA ; Jun Hyuk SEO ; Seong-Gyu CHOI ; Jong Man KIM ; Chi Ryang CHUNG ; Gee Young SUH ; Su Yeon LEE ; Dong Kyu OH ; Mi Hyeon PARK ; Chae-Man LIM ; Ryoung-Eun KO
Healthcare Informatics Research 2024;30(3):266-276
Objectives:
Sepsis is a leading global cause of mortality, and predicting its outcomes is vital for improving patient care. This study explored the capabilities of ChatGPT, a state-of-the-art natural language processing model, in predicting in-hospital mortality for sepsis patients.
Methods:
This study utilized data from the Korean Sepsis Alliance (KSA) database, collected between 2019 and 2021, focusing on adult intensive care unit (ICU) patients and aiming to determine whether ChatGPT could predict all-cause mortality after ICU admission at 7 and 30 days. Structured prompts enabled ChatGPT to engage in in-context learning, with the number of patient examples varying from zero to six. The predictive capabilities of ChatGPT-3.5-turbo and ChatGPT-4 were then compared against a gradient boosting model (GBM) using various performance metrics.
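The in-context learning setup described above can be sketched as a prompt builder that prepends zero to six labeled patient examples to the query patient. The prompt wording, field names, and values below are hypothetical; the actual prompts used in the study are not given in this abstract.

```python
def format_patient(record):
    """Render one patient's structured fields as a single line of text."""
    return ", ".join(f"{key}: {value}" for key, value in record.items())

def build_prompt(examples, query, horizon_days):
    """Build a few-shot prompt; an empty `examples` list gives a zero-shot prompt."""
    lines = [
        f"Task: predict whether the patient dies within {horizon_days} days "
        "of ICU admission. Answer 'died' or 'survived'."
    ]
    for features, outcome in examples:
        lines.append(f"Patient: {format_patient(features)}")
        lines.append(f"Outcome: {outcome}")
    lines.append(f"Patient: {format_patient(query)}")
    lines.append("Outcome:")  # left open for the model to complete
    return "\n".join(lines)

# Hypothetical records; field names are illustrative only.
examples = [
    ({"age": 81, "SOFA": 12, "lactate": 6.1}, "died"),
    ({"age": 54, "SOFA": 4, "lactate": 1.3}, "survived"),
]
prompt = build_prompt(examples, {"age": 67, "SOFA": 9, "lactate": 3.8}, horizon_days=7)
print(prompt)
```

The resulting string would then be sent to the language model; that call is omitted here because it depends on the client library and model version in use.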
Results:
From the KSA database, 4,786 patients formed the 7-day mortality prediction dataset, of whom 718 died, and 4,025 patients formed the 30-day dataset, with 1,368 deaths. Age and clinical markers (e.g., Sequential Organ Failure Assessment score and lactic acid levels) showed significant differences between survivors and non-survivors in both datasets. For 7-day mortality predictions, the area under the receiver operating characteristic curve (AUROC) was 0.70–0.83 for GPT-4, 0.51–0.70 for GPT-3.5, and 0.79 for GBM. The AUROC for 30-day mortality was 0.51–0.59 for GPT-4, 0.47–0.57 for GPT-3.5, and 0.76 for GBM. Zero-shot predictions of mortality from ICU admission to day 30 showed AUROCs from the mid-0.60s to 0.75 for GPT-4 and mainly from 0.47 to 0.63 for GPT-3.5.
Conclusions
GPT-4 demonstrated potential in predicting short-term in-hospital mortality, although its performance varied across different evaluation metrics.
10.Comparison of Surgical Burden, Radiographic and Clinical Outcomes According to the Severity of Baseline Sagittal Imbalance in Adult Spinal Deformity Patients
Se-Jun PARK ; Jin-Sung PARK ; Dong-Ho KANG ; Hyun-Jun KIM ; Yun-Mi LIM ; Chong-Suh LEE
Neurospine 2024;21(2):721-731
Objective:
To determine the clinical impact of the baseline sagittal imbalance severity in patients with adult spinal deformity (ASD).
Methods:
We retrospectively reviewed patients who underwent ≥ 5-level fusion including the pelvis, for ASD with a ≥ 2-year follow-up. Using the Scoliosis Research Society-Schwab classification system, patients were classified into 3 groups according to the severity of the preoperative sagittal imbalance: mild, moderate, and severe. Postoperative clinical and radiographic results were compared among the 3 groups.
Results:
A total of 259 patients were finally included. There were 42, 62, and 155 patients in the mild, moderate, and severe groups, respectively. The perioperative surgical burden was greatest in the severe group. Postoperatively, this group also showed the largest pelvic incidence minus lumbar lordosis mismatch, suggesting a tendency towards undercorrection. No statistically significant differences were observed in proximal junctional kyphosis, proximal junctional failure, or rod fractures among the groups. Visual analogue scale for back pain and Scoliosis Research Society-22 scores were similar across groups. However, the severe group’s last follow-up Oswestry Disability Index (ODI) scores were significantly worse than those of the other groups.
Conclusion
Patients with severe sagittal imbalance were treated with more invasive surgical methods, along with an increased perioperative surgical burden. All patients exhibited significant radiological and clinical improvements after surgery. However, regarding ODI, the severe group demonstrated slightly worse clinical outcomes than the other groups, probably due to a relatively higher proportion of undercorrection. Therefore, more rigorous correction is necessary to achieve optimal sagittal alignment specifically in patients with severe baseline sagittal imbalance.
