1. Adherence of Studies on Large Language Models for Medical Applications Published in Leading Medical Journals According to the MI-CLEAR-LLM Checklist
Ji Su KO ; Hwon HEO ; Chong Hyun SUH ; Jeho YI ; Woo Hyun SHIM
Korean Journal of Radiology 2025;26(4):304-312
Objective:
To evaluate the adherence of large language model (LLM)-based healthcare research to the Minimum Reporting Items for Clear Evaluation of Accuracy Reports of Large Language Models in Healthcare (MI-CLEAR-LLM) checklist, a framework designed to enhance the transparency and reproducibility of studies on the accuracy of LLMs for medical applications.
Materials and Methods:
A systematic PubMed search was conducted to identify articles on LLM performance published in high-ranking clinical medicine journals (the top 10% in each of the 59 specialties according to the 2023 Journal Impact Factor) from November 30, 2022, through June 25, 2024. Data on the six MI-CLEAR-LLM checklist items (1) identification and specification of the LLM used, 2) stochasticity handling, 3) prompt wording and syntax, 4) prompt structuring, 5) prompt testing and optimization, and 6) independence of the test data) were independently extracted by two reviewers, and adherence was calculated for each item.
Results:
Of 159 studies, 100% (159/159) reported the name of the LLM, 96.9% (154/159) reported the version, and 91.8% (146/159) reported the manufacturer. However, only 54.1% (86/159) reported the training data cutoff date, 6.3% (10/159) documented access to web-based information, and 50.9% (81/159) provided the date of the query attempts. Clear documentation regarding stochasticity management was provided in 15.1% (24/159) of the studies. Regarding prompt details, 49.1% (78/159) provided exact prompt wording and syntax but only 34.0% (54/159) documented prompt-structuring practices. While 46.5% (74/159) of the studies detailed prompt testing, only 15.7% (25/159) explained the rationale for specific word choices. Test data independence was reported for only 13.2% (21/159) of the studies, and 56.6% (43/76) provided URLs for internet-sourced test data.
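The adherence figures above are simple proportions (studies reporting an item divided by studies assessed). As a minimal sketch, not the authors' own analysis code, the percentages can be recomputed from the counts quoted in the Results; the `counts` dictionary, its labels, and the `adherence` helper are illustrative names introduced here.

```python
# Counts copied from the Results paragraph: label -> (studies reporting, studies assessed).
# The labels and the helper name are illustrative assumptions, not from the source.
counts = {
    "LLM name": (159, 159),
    "LLM version": (154, 159),
    "Manufacturer": (146, 159),
    "Training data cutoff date": (86, 159),
    "Access to web-based information": (10, 159),
    "Date of query attempts": (81, 159),
    "Stochasticity management": (24, 159),
    "Exact prompt wording and syntax": (78, 159),
    "Prompt-structuring practices": (54, 159),
    "Prompt testing": (74, 159),
    "Rationale for word choices": (25, 159),
    "Test data independence": (21, 159),
    "URLs for internet-sourced test data": (43, 76),
}


def adherence(reported: int, total: int) -> float:
    """Return adherence as a percentage rounded to one decimal place."""
    return round(100 * reported / total, 1)


for label, (reported, total) in counts.items():
    print(f"{label}: {adherence(reported, total):.1f}% ({reported}/{total})")
```

Note that the last item uses a denominator of 76 rather than 159, as in the abstract, which reports that proportion as 43/76.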
Conclusion:
Although basic LLM identification details were relatively well reported, other key aspects, including stochasticity, prompts, and test data, were frequently underreported. Enhancing adherence to the MI-CLEAR-LLM checklist will allow LLM research to achieve greater transparency and will foster more credible and reliable future studies.
2. Association of Rapidly Elevated Plasma Tau Protein With Cognitive Decline in Patients With Amnestic Mild Cognitive Impairment and Alzheimer’s Disease
Che-Sheng CHU ; Yu-Kai LIN ; Chia-Lin TSAI ; Yueh-Feng SUNG ; Chia-Kuang TSAI ; Guan-Yu LIN ; Chien-An KO ; Yi LIU ; Chih-Sung LIANG ; Fu-Chi YANG
Psychiatry Investigation 2025;22(2):130-139
Objective:
Whether elevations in plasma levels of amyloid and tau protein biomarkers are better indicators of cognitive decline than higher baseline levels in patients with amnestic mild cognitive impairment (aMCI) and Alzheimer’s disease (AD) remains understudied.
Methods:
We included 67 participants who were tested twice, over a 1-year interval, for AD-related plasma biomarkers via immunomagnetic reduction (IMR) assays (amyloid beta [Aβ]1-40, Aβ1-42, total tau [t-Tau], phosphorylated tau [p-Tau] 181, and alpha-synuclein [α-Syn]) and with the Mini-Mental State Examination (MMSE). We examined the correlation between biomarker levels (baseline vs. longitudinal change) and annual changes in the MMSE scores. Receiver operating characteristic curve analysis was conducted to compare the biomarkers.
Results:
After adjustment, faster cognitive decline was correlated with lower baseline levels of t-Tau (β=0.332, p=0.030) and p-Tau 181 (β=0.369, p=0.015) and with rapid elevation of t-Tau (β=-0.330, p=0.030) and p-Tau 181 levels (β=-0.431, p=0.004). However, the levels (baseline and longitudinal changes) of Aβ1-40, Aβ1-42, and α-Syn were not correlated with cognitive decline. aMCI converters had lower baseline levels of p-Tau 181 (p=0.002) but larger annual changes (p=0.001) than aMCI non-converters. The change in p-Tau 181 levels showed better discriminatory capacity than the change in t-Tau levels for identifying AD conversion in patients with aMCI, with an area under the curve of 86.7% versus 72.2%.
Conclusion:
We found that changes in p-Tau 181 levels may be a suitable biomarker for identifying AD conversion.
7. A Study on the Healthcare Workforce and Care for Acute Stroke: Results From the Survey of Hospitals Included in the National Acute Stroke Quality Assessment Program
Jong Young LEE ; Jun Kyeong KO ; Hak Cheol KO ; Hae-Won KOO ; Hyon-Jo KWON ; Dae-Won KIM ; Kangmin KIM ; Myeong Jin KIM ; Hoon KIM ; Keun Young PARK ; Kuhyun YANG ; Jae Sang OH ; Won Ki YOON ; Dong Hoon LEE ; Ho Jun YI ; Heui Seung LEE ; Jong-Kook RHIM ; Dong-Kyu JANG ; Youngjin JUNG ; Sang Woo HA ; Seung Hun SHEEN
Journal of Korean Medical Science 2025;40(16):e44-
Background:
With growing elderly populations, management of patients with acute stroke is increasingly important. In South Korea, the Acute Stroke Quality Assessment Program (ASQAP) has contributed to improving the quality of stroke care and practice behavior in healthcare institutions. While the mortality of hemorrhagic stroke remains high, there are only a few assessment indices associated with hemorrhagic stroke. Considering the need to develop assessment indices to improve the actual quality of care in the field of acute stroke treatment, this study aims to investigate the current status of human resources and practices related to the treatment of patients with acute stroke through a nationwide survey.
Methods:
For the healthcare institutions included in the Ninth ASQAP of the Health Insurance Review and Assessment Service (HIRA), data from January 2022 to December 2022 were collected through a survey on the current status and practice of healthcare providers related to the treatment of patients with acute stroke. The questionnaire consisted of 19 items, including six items on healthcare providers involved in stroke care and 10 items on the care of patients with acute stroke.
Results:
Among patients with acute stroke, neurosurgeons were the most common providers treating those with hemorrhagic stroke. The contribution of neurosurgeons to the treatment of ischemic stroke was also found to be equivalent to that of neurologists. However, a number of institutions were found to have no healthcare providers who perform definitive treatments, such as intra-arterial thrombectomy for patients with ischemic stroke or cerebral aneurysm clipping for subarachnoid hemorrhage. The workload intensity of healthcare providers involved in the care of patients with acute stroke, especially those involved in definitive treatment, was also found to be quite high.
Conclusion:
Currently, there are almost no assessment indices specific to hemorrhagic stroke in the ASQAP for acute stroke. Furthermore, the program does not reflect the reality of the healthcare providers and practices that provide definitive treatment for acute stroke. The findings of this study suggest the need to develop appropriate assessment indices that reflect the realities of acute stroke care.
8. Prediction of Hemifacial Spasm Re-Appearing Phenomenon after Microvascular Decompression Surgery in Patients with Hemifacial Spasm Using Dynamic Susceptibility Contrast Perfusion Magnetic Resonance Imaging
Seung Hoon LIM ; Xiao-Yi GUO ; Hyug-Gi KIM ; Hak Cheol KO ; Soonchan PARK ; Chang-Woo RYU ; Geon-Ho JAHNG
Journal of Korean Neurosurgical Society 2025;68(1):46-59
Objective:
Hemifacial spasm (HFS) is treated by a surgical procedure called microvascular decompression (MVD). However, some patients experience the HFS re-appearing phenomenon after MVD, presenting as early recurrence. In this study, dynamic susceptibility contrast (DSC) perfusion magnetic resonance imaging (MRI) and two analytical methods, receiver operating characteristic (ROC) curve analysis and machine learning, were used to predict early recurrence.
Methods:
This study enrolled 60 patients who underwent MVD for HFS. They were divided into two groups: group A consisted of 32 patients who had early recurrence, and group B consisted of 28 patients who had no early recurrence of HFS. All patients underwent DSC perfusion MRI before surgery to obtain several parameters. ROC curve and machine learning methods were used to predict early recurrence using these parameters.
Results:
Region-of-interest-based analysis showed that group A had significantly lower relative cerebral blood flow than group B in most of the selected brain regions. The best prediction model for early recurrence was obtained by combining three extraction fraction (EF) values, at the middle temporal gyrus, posterior cingulate, and brainstem, with age using the naive Bayes machine learning method. This model had an area under the curve of 0.845.
Conclusion:
By combining EF values with age or sex using machine learning methods, DSC perfusion MRI can be used to predict early recurrence before MVD surgery. This may help neurosurgeons identify patients who are at risk of HFS recurrence and provide appropriate postoperative care.