1.Advancing Korean Medical Large Language Models: Automated Pipeline for Korean Medical Preference Dataset Construction
Jean SEO ; Sumin PARK ; Sungjoo BYUN ; Jinwook CHOI ; Jinho CHOI ; Hyopil SHIN
Healthcare Informatics Research 2025;31(2):166-174
Objectives:
Developing large language models (LLMs) in biomedicine requires access to high-quality training and alignment tuning datasets. However, publicly available Korean medical preference datasets are scarce, hindering the advancement of Korean medical LLMs. This study constructs and evaluates the efficacy of the Korean Medical Preference Dataset (KoMeP), an alignment tuning dataset constructed with an automated pipeline, minimizing the high costs of human annotation.
Methods:
KoMeP was generated using the DAHL score, an automated hallucination evaluation metric. Five LLMs (Dolly-v2-3B, MPT-7B, GPT-4o, Qwen-2-7B, Llama-3-8B) produced responses to 8,573 biomedical examination questions, from which 5,551 preference pairs were extracted. Each pair consisted of a “chosen” response and a “rejected” response, as determined by their DAHL scores. The dataset was evaluated by training five different models with each of two alignment tuning methods: direct preference optimization (DPO) and odds ratio preference optimization (ORPO). The KorMedMCQA benchmark was employed to assess the effectiveness of alignment tuning.
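The pairing step described in the Methods can be illustrated with a minimal sketch. The abstract does not state how the "chosen" and "rejected" responses were selected among the five candidates, so the rule below (highest score versus lowest score, skipping ties) is an assumption, as is the convention that a higher score means fewer hallucinations. The names `ScoredResponse`, `build_preference_pair`, and `min_gap` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredResponse:
    model: str    # name of the LLM that produced the response
    text: str     # the generated answer
    score: float  # hypothetical hallucination score; higher is assumed to be better


def build_preference_pair(
    question: str,
    responses: list[ScoredResponse],
    min_gap: float = 0.0,
) -> Optional[dict]:
    """Return a prompt/chosen/rejected record, or None if no clear preference exists.

    Assumed rule: the best-scoring response is "chosen" and the worst-scoring
    response is "rejected"; questions whose score gap is not larger than
    min_gap are skipped.
    """
    if len(responses) < 2:
        return None
    ranked = sorted(responses, key=lambda r: r.score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if best.score - worst.score <= min_gap:
        return None  # tie or negligible gap: no usable preference signal
    return {"prompt": question, "chosen": best.text, "rejected": worst.text}
```

Under this assumed rule, questions where all responses score equally yield no pair, which is one plausible reason a pipeline of this kind would produce fewer pairs than questions.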
Results:
Models trained with DPO consistently improved KorMedMCQA performance; notably, Llama-3.1-8B showed a 43.96% increase. In contrast, ORPO training produced inconsistent results. Additionally, English-to-Korean transfer learning proved effective, particularly for English-centric models like Gemma-2, whereas Korean-to-English transfer learning achieved limited success. Instruction tuning with KoMeP yielded mixed outcomes, which suggests challenges in dataset formatting.
Conclusions:
KoMeP is the first publicly available Korean medical preference dataset and significantly improves alignment tuning performance in LLMs. The DPO method outperforms ORPO in alignment tuning. Future work should focus on expanding KoMeP, developing a Korean-native dataset, and refining alignment tuning methods to produce safer and more reliable Korean medical LLMs.
2.Enhancing Identification of High-Risk cN0 Lung Adenocarcinoma Patients Using MRI-Based Radiomic Features
Harim KIM ; Jonghoon KIM ; Soohyun HWANG ; You Jin OH ; Joong Hyun AHN ; Min-Ji KIM ; Tae Hee HONG ; Sung Goo PARK ; Joon Young CHOI ; Hong Kwan KIM ; Jhingook KIM ; Sumin SHIN ; Ho Yun LEE
Cancer Research and Treatment 2025;57(1):57-69
Purpose:
This study aimed to develop a magnetic resonance imaging (MRI)–based radiomics model to predict high-risk pathologic features for lung adenocarcinoma: micropapillary and solid pattern (MPsol), spread through air space, and poorly differentiated patterns.
Materials and Methods:
In this prospective study, we screened clinical N0 lung cancer patients who were surgical candidates and had undergone both 18F-fluorodeoxyglucose (FDG) positron emission tomography–computed tomography (PET/CT) and chest CT from August 2018 to January 2020. Using CT and FDG PET/CT, we recruited patients meeting our proposed imaging criteria for high risk, that is, a poorer prognosis of lung adenocarcinoma. Where possible, these patients underwent an MRI examination, from which we extracted 77 radiomics features from T1-contrast-enhanced and T2-weighted images. Additionally, patient demographics, the maximum standardized uptake value on FDG PET/CT, and the mean apparent diffusion coefficient value on diffusion-weighted imaging were considered together to build prediction models for high-risk pathologic features.
Results:
Among 616 patients, 72 met the imaging criteria for high-risk lung cancer and underwent lung MRI. The magnetic resonance (MR)–eligible group showed a higher prevalence of nodal upstaging (29.2% vs. 4.2%, p < 0.001), vascular invasion (6.5% vs. 2.1%, p=0.011), and high-grade pathologic features (p < 0.001), as well as worse 4-year disease-free survival (p < 0.001), compared with the non-MR-eligible group. The MR-based radiomics model showed good predictive power for high-risk pathologic features, with mean area under the receiver operating characteristic curve (AUC) values of 0.751-0.886 in test sets. Adding clinical variables increased the predictive performance for MPsol and the poorly differentiated pattern using the 2021 grading system (AUC, 0.860 and 0.907, respectively).
Conclusion:
Our imaging criteria can effectively screen for high-risk lung cancer patients, and our MR-based radiomics prediction model can predict high-risk pathologic features.
7.Psychometric Validation of the Korean Version of the Cancer Survivors’ Unmet Needs (CaSUN) Scale among Korean Non–Small Cell Lung Cancer Survivors
Danbee KANG ; Genehee LEE ; Sooyeon KIM ; Heesu NAM ; Sunga KONG ; Sungkeun SHIM ; Jae Kyung LEE ; Wonyoung JUNG ; Sumin SHIN ; Hong Kwan KIM ; Jae Ill ZO ; Young Mog SHIM ; Dong Wook SHIN ; Juhee CHO
Cancer Research and Treatment 2023;55(1):61-72
Purpose:
The purpose of the study was to validate the Korean version of Cancer Survivors’ Unmet Needs (CaSUN) scale among non–small cell lung cancer survivors.
Materials and Methods:
Participants were recruited from outpatient clinics at the Samsung Medical Center in Seoul, South Korea, from January to October 2020. Participants completed a survey questionnaire that included the CaSUN. Exploratory and confirmatory factor analysis and Pearson’s correlations were used to evaluate the reliability and validity of the Korean version of the CaSUN (CaSUN-K). We also tested known-group validity using an independent t test or ANOVA.
Results:
In total, 949 participants provided informed consent, all of whom completed the questionnaire. Among the 949 participants, 529 (55.7%) were male; the mean age (standard deviation) was 63.4±8.8 years, and the median time since the end of active treatment was 18 months. Although the factor loadings were different from those for the original scale, the Cronbach’s alpha coefficients of the six domains in the CaSUN-K ranged from 0.68 to 0.95, indicating satisfactory internal consistency. In the CFA, the goodness-of-fit indices for the CaSUN-K were high. Moderate correlations demonstrated the convergent validity of CaSUN-K with the relevant questionnaire. More than 60% of the participants reported information-related unmet needs, and the CaSUN-K discriminated between the needs reported by the different subgroups that we analyzed.
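For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), for one domain with k items. The function name `cronbach_alpha` and the toy data are illustrative assumptions; the abstract does not describe the software or exact computation the authors used.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for one domain.

    rows: one list per respondent, holding that respondent's score on each
    of the k items in the domain. Sample variances (n - 1 denominator) are used.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the per-respondent total score
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (not from the study): two items that agree perfectly give alpha = 1
toy = [[1, 1], [2, 2], [3, 3], [4, 4]]
alpha = cronbach_alpha(toy)
```

Values closer to 1 indicate that the items in a domain vary together, which is the sense in which the reported range of 0.68 to 0.95 is read as internal consistency.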
Conclusion:
The CaSUN-K is a reliable and valid measure for assessing unmet needs in a cancer population; this tool can thus help that population receive timely, targeted, and relevant care.
8.Outcomes of robotic sacrocolpopexy
Obstetrics & Gynecology Science 2023;66(6):509-517
This review aimed to summarize the complications and surgical outcomes of robot-assisted sacrocolpopexy. Nineteen original articles on 1,440 robotic sacrocolpopexies were reviewed, and three systematic reviews and meta-analyses were summarized in terms of intraoperative, perioperative, postoperative, and/or surgical outcomes. Robotic sacrocolpopexy has demonstrated low overall complication rates and favorable surgical outcomes. Nevertheless, long-term follow-up outcomes regarding objective and/or subjective prolapse recurrence, reoperation rates, and mesh-related complications remain unclear. Further research is required to demonstrate whether the robotic approach for sacrocolpopexy is feasible or can become the modality of choice in the future when performing sacrocolpopexy.
9.Clinical Features of Group B Streptococcus Colonization in Vagina During Late Pregnancy at a Primary Maternity Hospital
Journal of the Korean Society of Maternal and Child Health 2022;26(1):27-34
Purpose:
The aim of this study was to assess the epidemiologic and clinical features of maternal vaginal Group B Streptococcus (GBS) colonization during the third trimester of pregnancy.
Methods:
This study included 644 pregnant women who had undergone a GBS culture test in their third trimester in 2018. We collected data from a primary-care maternity hospital through a retrospective chart review. We compared patients’ demographics, maternal obstetrical complications, and neonatal adverse events between the GBS-positive (n=41) and GBS-negative (n=603) groups. To identify clinical predictors of a GBS-positive result, univariable chi-square tests and multivariable logistic regression analysis were applied.
Results:
The colonization rate of GBS in the maternal vagina was 6.4% in the third trimester. The GBS-positive group showed a significant association with third trimester anemia (hemoglobin level <10.5 g/dL) (p=0.013) and oligohydramnios (p=0.024; odds ratio, 7.32; 95% confidence interval, 1.28–41.31). All specimens were susceptible to penicillin G and cephalosporin. The rate of antibiotic resistance to both erythromycin and clindamycin was 31%.
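The reported colonization rate follows directly from the group sizes given in the Methods (41 GBS-positive and 603 GBS-negative women). A short arithmetic check, with variable names chosen here only for illustration:

```python
# Group sizes as stated in the Methods
gbs_positive = 41
gbs_negative = 603

total = gbs_positive + gbs_negative            # all women tested
colonization_rate = 100 * gbs_positive / total  # percentage of positive cultures

print(f"{total} women tested, colonization rate = {colonization_rate:.1f}%")
```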
Conclusion:
The colonization rate of GBS in the maternal vagina was 6.4%, and third trimester maternal anemia was associated with GBS carrier status.
10.Guidelines for accreditation of endoscopy units: quality measures from the Korean Society of Coloproctology
Rumi SHIN ; Seongdae LEE ; Kyung-Su HAN ; Dae Kyung SOHN ; Sang Hui MOON ; Dong Hyun CHOI ; Bong-Hyeon KYE ; Hae-Jung SON ; Sun Il LEE ; Sumin SI ; Won-Kyung KANG
Annals of Surgical Treatment and Research 2021;100(3):154-165
Purpose:
Colonoscopy is an effective method of screening for colorectal cancer (CRC), and it can prevent CRC by detection and removal of precancerous lesions. The most important considerations when performing colonoscopy screening are the safety and satisfaction of the patient and the diagnostic accuracy. Accordingly, the Korean Society of Coloproctology (KSCP) herein proposes an optimal level of standard performance to be used in endoscopy units and by individual colonoscopists for screening colonoscopy. These guidelines establish specific criteria for assessment of safety and quality in screening colonoscopy.
Methods:
The Colonoscopy Committee of the KSCP commissioned this Position Statement. Expert gastrointestinal surgeons representing the KSCP reviewed the published evidence to identify acceptable quality indicators and indicators that lacked sufficient evidence.
Results:
The KSCP recommends an optimal standard list for quality control of screening colonoscopy in the following 6 categories: training and competency of the colonoscopist, procedural quality, facilities and equipment, performance indicators and auditable outcomes, disinfection of equipment, and sedation and recovery of the patient.
Conclusion:
The KSCP recommends that endoscopy units performing CRC screening evaluate 6 key performance measures during daily practice.
