1. Advancing Korean Medical Large Language Models: Automated Pipeline for Korean Medical Preference Dataset Construction
Jean SEO ; Sumin PARK ; Sungjoo BYUN ; Jinwook CHOI ; Jinho CHOI ; Hyopil SHIN
Healthcare Informatics Research 2025;31(2):166-174
Objectives:
Developing large language models (LLMs) in biomedicine requires access to high-quality training and alignment tuning datasets. However, publicly available Korean medical preference datasets are scarce, hindering the advancement of Korean medical LLMs. This study constructs the Korean Medical Preference Dataset (KoMeP), an alignment tuning dataset built with an automated pipeline that minimizes the high cost of human annotation, and evaluates its efficacy.
Methods:
KoMeP was generated using the DAHL score, an automated hallucination evaluation metric. Five LLMs (Dolly-v2-3B, MPT-7B, GPT-4o, Qwen-2-7B, Llama-3-8B) produced responses to 8,573 biomedical examination questions, from which 5,551 preference pairs were extracted. Each pair consisted of a “chosen” response and a “rejected” response, as determined by their DAHL scores. The dataset was evaluated by training five different models with each of two alignment tuning methods: direct preference optimization (DPO) and odds ratio preference optimization (ORPO). The KorMedMCQA benchmark was employed to assess the effectiveness of alignment tuning.
Results:
Models trained with DPO consistently improved KorMedMCQA performance; notably, Llama-3.1-8B showed a 43.96% increase. In contrast, ORPO training produced inconsistent results. Additionally, English-to-Korean transfer learning proved effective, particularly for English-centric models like Gemma-2, whereas Korean-to-English transfer learning achieved limited success. Instruction tuning with KoMeP yielded mixed outcomes, which suggests challenges in dataset formatting.
Conclusions:
KoMeP is the first publicly available Korean medical preference dataset and significantly improves alignment tuning performance in LLMs. The DPO method outperforms ORPO in alignment tuning. Future work should focus on expanding KoMeP, developing a Korean-native dataset, and refining alignment tuning methods to produce safer and more reliable Korean medical LLMs.
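The pair-extraction step described in the Methods can be sketched as follows. This is a minimal, hypothetical Python illustration, not the authors' pipeline code: it assumes each question carries several model responses with precomputed DAHL scores, and forms a chosen/rejected pair from the highest- and lowest-scoring responses.

```python
# Hypothetical sketch of DAHL-based preference pair extraction.
# Assumes higher DAHL score = fewer hallucinations (an assumption,
# not a claim about the metric's actual scale).

def build_preference_pairs(questions):
    """Return preference pairs as {prompt, chosen, rejected} dicts.

    `questions` maps a question string to a list of
    (response_text, dahl_score) tuples from different LLMs.
    Ties are skipped, which is one reason 8,573 questions
    could yield only 5,551 pairs.
    """
    pairs = []
    for question, responses in questions.items():
        ranked = sorted(responses, key=lambda r: r[1], reverse=True)
        chosen, rejected = ranked[0], ranked[-1]
        if chosen[1] > rejected[1]:  # no preference signal on a tie
            pairs.append({
                "prompt": question,
                "chosen": chosen[0],
                "rejected": rejected[0],
            })
    return pairs


example = {
    "What is the first-line treatment for X?": [
        ("Response A", 0.91),
        ("Response B", 0.42),
        ("Response C", 0.77),
    ]
}
print(build_preference_pairs(example))
```

The {prompt, chosen, rejected} layout mirrors the format commonly expected by DPO/ORPO training libraries, which is presumably why a pairwise dataset was the target representation.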
2. Era of Digital Healthcare: Emergence of the Smart Patient
Dooyoung HUHH ; Kwangsoo SHIN ; Miyeong KIM ; Jisan LEE ; Hana KIM ; Jinho CHOI ; Suyeon BAN
Healthcare Informatics Research 2025;31(1):107-110
7. Prognostic Value of Residual Circulating Tumor DNA in Metastatic Pancreatic Ductal Adenocarcinoma
Hongkyung KIM ; Jinho LEE ; Mi Ri PARK ; Zisun CHOI ; Seung Jung HAN ; Dongha KIM ; Saeam SHIN ; Seung-Tae LEE ; Jong Rak CHOI ; Seung Woo PARK
Annals of Laboratory Medicine 2025;45(2):199-208
Background:
Circulating tumor DNA (ctDNA) is a potential biomarker in pancreatic ductal adenocarcinoma (PDAC). However, studies on residual ctDNA in patients post-chemotherapy are limited. We assessed the prognostic value of residual ctDNA in metastatic PDAC relative to that of carbohydrate antigen 19-9 (CA19-9).
Methods:
ctDNA analysis using a targeted next-generation sequencing panel was performed at baseline and during chemotherapy response evaluation in 53 patients. Progression-free survival (PFS) and overall survival (OS) were first evaluated based on ctDNA positivity at baseline. For further comparison, patients testing ctDNA-positive at baseline were subdivided based on residual ctDNA into ctDNA responders (no residual ctDNA post-chemotherapy) and ctDNA non-responders (residual ctDNA post-chemotherapy). Additional survival analysis was performed based on CA19-9 levels.
Results:
The baseline ctDNA detection rate was 56.6%. Although clinical outcomes tended to be poorer in patients with baseline ctDNA positivity than in those without, the differences were not significant. Residual ctDNA post-chemotherapy was associated with reduced PFS and OS. The prognosis of ctDNA responders was better than that of non-responders but did not significantly differ from that of ctDNA-negative individuals (no ctDNA at baseline or post-chemotherapy). Compared with the ctDNA response to chemotherapy, a ≥50% decrease in the CA19-9 level had a smaller effect on both PFS and OS, based on hazard ratios and significance levels. ctDNA could be monitored in half of the patients whose baseline CA19-9 levels were within the reference range.
Conclusions:
Residual ctDNA analysis post-chemotherapy is a promising approach for predicting the clinical outcomes of patients with metastatic PDAC.
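The survival comparison described in this abstract (PFS/OS by ctDNA response group) can be illustrated with a minimal Kaplan-Meier estimator. This is a hedged sketch with invented follow-up data, not the study's actual analysis or cohort.

```python
# Minimal Kaplan-Meier sketch for comparing survival between ctDNA
# responders and non-responders. All patient data below are invented
# for illustration only.

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each observed event time.

    `times`  : follow-up time for each patient (e.g., months)
    `events` : 1 if the event (progression/death) occurred, 0 if censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            surv *= 1 - deaths / at_risk  # product-limit update
            curve.append((t, surv))
        at_risk -= sum(1 for ti in times if ti == t)  # remove events + censored
    return curve


# Invented follow-up times (months); 1 = event observed, 0 = censored
responders = kaplan_meier([6, 9, 12, 15, 18], [1, 0, 1, 0, 1])
non_responders = kaplan_meier([2, 3, 5, 7, 9], [1, 1, 1, 1, 0])
print(responders)
print(non_responders)
```

In practice a log-rank test and Cox hazard ratios (as the abstract's comparison implies) would accompany such curves; dedicated libraries such as lifelines handle censoring and confidence intervals properly.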
8. Development and Validation of the Korean Version of the Edinburgh Cognitive and Behavioral Amyotrophic Lateral Sclerosis Screen (ECAS-K)
Jeeun LEE ; Ahwon KIM ; Seok-Jin CHOI ; Eric CHO ; Jaeyoung SEO ; Seong-il OH ; Jinho JUNG ; Ji-Sun KIM ; Jung-Joon SUNG ; Sharon ABRAHAMS ; Yoon-Ho HONG
Journal of Clinical Neurology 2024;20(6):637-637
