1.Advancing Korean Medical Large Language Models: Automated Pipeline for Korean Medical Preference Dataset Construction
Jean SEO ; Sumin PARK ; Sungjoo BYUN ; Jinwook CHOI ; Jinho CHOI ; Hyopil SHIN
Healthcare Informatics Research 2025;31(2):166-174
Objectives:
Developing large language models (LLMs) in biomedicine requires access to high-quality training and alignment tuning datasets. However, publicly available Korean medical preference datasets are scarce, hindering the advancement of Korean medical LLMs. This study constructs the Korean Medical Preference Dataset (KoMeP), an alignment tuning dataset built with an automated pipeline that minimizes the high cost of human annotation, and evaluates its efficacy.
Methods:
KoMeP was generated using the DAHL score, an automated hallucination evaluation metric. Five LLMs (Dolly-v2-3B, MPT-7B, GPT-4o, Qwen-2-7B, Llama-3-8B) produced responses to 8,573 biomedical examination questions, from which 5,551 preference pairs were extracted. Each pair consisted of a “chosen” response and a “rejected” response, as determined by their DAHL scores. The dataset was evaluated by training five different models with each of two alignment tuning methods, direct preference optimization (DPO) and odds ratio preference optimization (ORPO). The KorMedMCQA benchmark was employed to assess the effectiveness of alignment tuning.
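A minimal sketch of how such DAHL-based preference pairs might be assembled, assuming a higher DAHL score marks the more factual (less hallucinated) response; the data layout, field names, and pairing rule here are illustrative assumptions, not the authors' released pipeline:

```python
from itertools import combinations

def extract_preference_pairs(scored_responses):
    """Build (chosen, rejected) pairs from DAHL-scored model responses.

    scored_responses maps each question to a list of dicts such as
    {"model": "gpt-4o", "text": "...", "dahl": 0.92}. The pairing rule
    below (keep any two responses with distinct scores) is a
    simplification; it assumes the higher-scoring response is "chosen".
    """
    pairs = []
    for question, responses in scored_responses.items():
        for a, b in combinations(responses, 2):
            if a["dahl"] == b["dahl"]:
                continue  # a tie carries no preference signal
            chosen, rejected = (a, b) if a["dahl"] > b["dahl"] else (b, a)
            pairs.append({
                "prompt": question,
                "chosen": chosen["text"],
                "rejected": rejected["text"],
            })
    return pairs
```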
Results:
Models trained with DPO consistently improved KorMedMCQA performance; notably, Llama-3.1-8B showed a 43.96% increase. In contrast, ORPO training produced inconsistent results. Additionally, English-to-Korean transfer learning proved effective, particularly for English-centric models such as Gemma-2, whereas Korean-to-English transfer learning achieved limited success. Instruction tuning with KoMeP yielded mixed outcomes, suggesting challenges in dataset formatting.
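For context on the DPO gains reported above, the objective being optimized is the standard direct preference optimization loss of Rafailov et al. (2023); the sketch below is that published formulation in PyTorch, not code released with this paper:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over a batch of preference pairs.

    Each argument is a tensor of summed token log-probabilities of the
    chosen/rejected response under the trained policy or the frozen
    reference model; beta scales the implicit KL penalty.
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Minimized when the policy favors the chosen response more strongly
    # (relative to the reference) than it favors the rejected one.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```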
Conclusions:
KoMeP is the first publicly available Korean medical preference dataset and significantly improves alignment tuning performance in LLMs. The DPO method outperforms ORPO in alignment tuning. Future work should focus on expanding KoMeP, developing a Korean-native dataset, and refining alignment tuning methods to produce safer and more reliable Korean medical LLMs.
2.Era of Digital Healthcare: Emergence of the Smart Patient
Dooyoung HUHH ; Kwangsoo SHIN ; Miyeong KIM ; Jisan LEE ; Hana KIM ; Jinho CHOI ; Suyeon BAN
Healthcare Informatics Research 2025;31(1):107-110
3.Development and Validation of the Korean Version of the Edinburgh Cognitive and Behavioral Amyotrophic Lateral Sclerosis Screen (ECAS-K)
Jeeun LEE ; Ahwon KIM ; Seok-Jin CHOI ; Eric CHO ; Jaeyoung SEO ; Seong-il OH ; Jinho JUNG ; Ji-Sun KIM ; Jung-Joon SUNG ; Sharon ABRAHAMS ; Yoon-Ho HONG
Journal of Clinical Neurology 2024;20(6):637-637
4.Molecular detection of Borrelia theileri in cattle in Korea
Hyeon-Ji HYUNG ; Yun-Sil CHOI ; Jinho PARK ; Kwang-Jun LEE ; Jun-Gu KANG
Parasites, Hosts and Diseases 2024;62(1):151-156
Bovine borreliosis, caused by Borrelia theileri, which is transmitted through the bites of hard ticks, is associated with mild clinical signs such as fever, lethargy, hemoglobinuria, anorexia, and anemia. Borrelia theileri infects various animals, including cattle, deer, horses, goats, sheep, and wild ruminants, in Africa, Australia, and South America. Notably, no case of B. theileri infection had previously been reported in Korean cattle. In this study, 101 blood samples were collected from a Korean indigenous cattle breed, of which 1.98% tested positive for B. theileri by nested PCR. The obtained sequences exhibited high homology with B. theileri strains identified in other regions. Phylogenetic analysis of the 16S rRNA gene confirmed affiliation with the B. theileri group; however, the flagellin B sequences exhibited divergence, potentially reflecting regional evolutionary differences. This study provides the first molecular confirmation of B. theileri infection in Korean livestock. Further isolation and nucleotide sequence analyses are necessary to better understand the B. theileri strains present in cattle in Korea.
