1.Advancing Korean Medical Large Language Models: Automated Pipeline for Korean Medical Preference Dataset Construction
Jean SEO ; Sumin PARK ; Sungjoo BYUN ; Jinwook CHOI ; Jinho CHOI ; Hyopil SHIN
Healthcare Informatics Research 2025;31(2):166-174
Objectives:
Developing large language models (LLMs) in biomedicine requires access to high-quality training and alignment tuning datasets. However, publicly available Korean medical preference datasets are scarce, hindering the advancement of Korean medical LLMs. This study constructs and evaluates the efficacy of the Korean Medical Preference Dataset (KoMeP), an alignment tuning dataset constructed with an automated pipeline, minimizing the high costs of human annotation.
Methods:
KoMeP was generated using the DAHL score, an automated hallucination evaluation metric. Five LLMs (Dolly-v2-3B, MPT-7B, GPT-4o, Qwen-2-7B, Llama-3-8B) produced responses to 8,573 biomedical examination questions, from which 5,551 preference pairs were extracted. Each pair consisted of a “chosen” response and a “rejected” response, as determined by their DAHL scores. The dataset was evaluated by training five different models with two alignment tuning methods: direct preference optimization (DPO) and odds ratio preference optimization (ORPO). The KorMedMCQA benchmark was used to assess the effectiveness of alignment tuning.
Results:
Models trained with DPO consistently improved KorMedMCQA performance; notably, Llama-3.1-8B showed a 43.96% increase. In contrast, ORPO training produced inconsistent results. Additionally, English-to-Korean transfer learning proved effective, particularly for English-centric models like Gemma-2, whereas Korean-to-English transfer learning achieved limited success. Instruction tuning with KoMeP yielded mixed outcomes, which suggests challenges in dataset formatting.
Conclusions:
KoMeP is the first publicly available Korean medical preference dataset and significantly improves alignment tuning performance in LLMs. The DPO method outperforms ORPO in alignment tuning. Future work should focus on expanding KoMeP, developing a Korean-native dataset, and refining alignment tuning methods to produce safer and more reliable Korean medical LLMs.
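The pair-extraction step described in the KoMeP Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state the selection rule, so the sketch assumes that a higher DAHL score means fewer hallucinations, that the highest- and lowest-scoring responses form the pair, and that questions whose responses tie are skipped. `ScoredResponse`, `build_pair`, and `min_gap` are hypothetical names, and the scores are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredResponse:
    model: str    # which LLM produced the response
    text: str     # the response text
    score: float  # ASSUMPTION: higher score = fewer hallucinations


def build_pair(question: str,
               responses: list[ScoredResponse],
               min_gap: float = 0.0) -> Optional[dict]:
    """Return a prompt/chosen/rejected record, or None if no clear preference.

    ASSUMPTION: the pair is the best- vs. worst-scoring response, and a
    question is dropped when the score gap is not larger than min_gap.
    """
    if len(responses) < 2:
        return None
    best = max(responses, key=lambda r: r.score)
    worst = min(responses, key=lambda r: r.score)
    if best.score - worst.score <= min_gap:
        return None  # tie or near-tie: no usable preference signal
    return {"prompt": question, "chosen": best.text, "rejected": worst.text}


# Placeholder data, not taken from KoMeP.
example = build_pair(
    "Which vitamin deficiency causes scurvy?",
    [
        ScoredResponse("model_a", "Vitamin C deficiency.", 0.9),
        ScoredResponse("model_b", "Vitamin D deficiency.", 0.2),
        ScoredResponse("model_c", "It is caused by low vitamin C.", 0.7),
    ],
)
```

Records with `prompt`, `chosen`, and `rejected` fields are the shape commonly expected by preference-tuning trainers, which is why the sketch returns that layout.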
4.NF-kappa B activation following optic nerve transection.
Jun Sub CHOI ; Sungjoo KIM-YOON ; Choun Ki JOO
Korean Journal of Ophthalmology 1998;12(1):19-24
To elucidate in vivo neuronal cell death in the retina and the involvement of NF-kappa B in this process, we studied the degeneration of retinal ganglion cells (RGCs) and the activation of NF-kappa B after transection of the optic nerve in adult rats at 5 mm from the eyeball. The morphology of dying ganglion cells in the retinal ganglion cell layer was observed by light and electron microscopy, and the activation of NF-kappa B was investigated immunohistochemically. Seven and 14 days post-axotomy, dying cells contained pyknotic nuclei. The death of retinal ganglion cells involved apoptosis, and activation of NF-kappa B (p50 and p65) was prominent in a time-dependent manner. We observed axotomy-induced NF-kappa B activation, which may mediate apoptosis of retinal ganglion cells.
Animal
;
Apoptosis/physiology
;
Axotomy
;
Immunohistochemistry
;
Male
;
Microscopy, Electron
;
NF-kappa B/biosynthesis*
;
Optic Nerve/surgery
;
Rats
;
Rats, Sprague-Dawley
;
Retinal Ganglion Cells/ultrastructure
;
Retinal Ganglion Cells/pathology*
;
Retinal Ganglion Cells/metabolism
;
Substances: NF-kappa B
5.The Effect of Cigarette Price on Smoking Behavior in Korea.
Woojin CHUNG ; Seungji LIM ; Sunmi LEE ; Sungjoo CHOI ; Kayoung SHIN ; Kyungsook CHO
Journal of Preventive Medicine and Public Health 2007;40(5):371-380
OBJECTIVES: To determine the impact of cigarette prices on the decision to initiate and quit smoking by taking into account the interdependence of smoking and other behavioral risk factors. METHODS: The study population consisted of 3,000 male Koreans aged ≥20. A survey by telephone interview was undertaken to collect information on cigarette price, smoking and other behavioral risk factors. A two-part model was used to examine separately the effect of price on the decision to be a smoker, and on the amount of cigarettes smoked. RESULTS: The overall price elasticity of cigarettes was estimated at -0.66, with a price elasticity of -0.02 for smoking participation and -0.64 for the amount of cigarettes consumed by smokers. The inclusion of other behavioral risk factors reduced the estimated price elasticity for smoking participation substantially, but had no effect on the conditional price elasticity for the quantity of cigarettes smoked. CONCLUSIONS: From the public health and financial perspectives, an increase in cigarette price would significantly reduce smoking prevalence as well as cigarette consumption by smokers in Korea.
Adult
;
*Costs and Cost Analysis
;
Health Behavior
;
Humans
;
Korea/epidemiology
;
Male
;
Middle Aged
;
Risk Factors
;
Smoking/*economics/*prevention & control
;
Social Environment
;
Socioeconomic Factors
;
*Tobacco
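The elasticities reported in the cigarette price study above fit the usual two-part decomposition: expected consumption is the probability of smoking times the amount smoked by smokers, so the overall price elasticity is the sum of the participation elasticity and the conditional elasticity (-0.02 + -0.64 = -0.66). The sketch below checks that arithmetic and applies a first-order approximation to a hypothetical 10% price increase. The function names and the 10% figure are illustrative assumptions, not part of the study.

```python
def total_elasticity(participation: float, conditional: float) -> float:
    """Overall price elasticity in a two-part model.

    Since E[q] = P(smoker) * E[q | smoker], taking the log-derivative with
    respect to log price gives the sum of the two component elasticities.
    """
    return participation + conditional


def approx_pct_change(elasticity: float, pct_price_change: float) -> float:
    """First-order (point-elasticity) approximation of the % change in quantity."""
    return elasticity * pct_price_change


# Elasticities as reported in the abstract.
participation_elasticity = -0.02
conditional_elasticity = -0.64

overall = total_elasticity(participation_elasticity, conditional_elasticity)

# HYPOTHETICAL scenario: a 10% price increase (not a figure from the study).
change = approx_pct_change(overall, 10.0)

print(f"overall elasticity: {overall:.2f}")
print(f"approx. change in consumption for +10% price: {change:.1f}%")
```

The linear approximation is only reasonable for small price changes; larger changes would need the full fitted model rather than a single elasticity.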
6.Efficacy of Enhanced MRI in Epidural Varix: Report of Six Cases.
Shinkwon CHOI ; Kyang Yul KIM ; Sungjoo LEE ; Sunghwan YOON
Journal of Korean Society of Spine Surgery 2006;13(3):210-214
Symptomatic epidural varix presenting with radiculopathy is extremely rare. In preoperative evaluation, it is most commonly misdiagnosed as a sequestrated prolapsed nucleus pulposus. Evaluation with enhanced MRI studies improved the detection and treatment of this condition. We report 6 cases of epidural varices that were diagnosed with T1 fat-suppressed post-gadolinium enhanced MRI studies and present the operative findings.
Diagnostic Errors
;
Magnetic Resonance Imaging*
;
Radiculopathy
;
Varicose Veins*
7.Effect of MK801 and CNQX on Retinal Injury Induced by Ischemia, NMDA, or Kainate.
Jun Sub CHOI ; Byung Joo GWAG ; SungJoo Kim YOON ; Choun Ki JOO
Journal of the Korean Ophthalmological Society 1998;39(8):1794-1800
To examine the protection against retinal cell death by glutamate antagonists in vivo, this study was carried out in a pressure-induced ischemia model. First, we observed that ischemia resulted in retinal damage similar to the injuries caused by NMDA and kainate toxicity. Second, the retinal cell death caused by ischemia was prevented by MK801 and CNQX, glutamate antagonists for NMDA and kainate excitotoxicity, respectively, at 24 hr after ischemia. MK801 was shown to prevent cell death in the ganglion cell layer and CNQX in the inner nuclear layer. In addition, the combination of CNQX and MK801 protected retinal neuronal cells from ischemic injury better than either agent applied separately. The partial protection against retinal cell death by glutamate antagonists in the ischemia model indicates that glutamate toxicity, as well as other cell death mechanisms such as apoptosis, mediates ischemia-induced retinal cell death. Thus, cell death by other mechanisms must also be blocked in order to prevent retinal cell death completely.
6-Cyano-7-nitroquinoxaline-2,3-dione*
;
Apoptosis
;
Cell Death
;
Dizocilpine Maleate*
;
Excitatory Amino Acid Antagonists
;
Ganglion Cysts
;
Glutamic Acid
;
Ischemia*
;
Kainic Acid*
;
N-Methylaspartate*
;
Neurons
;
Retina
;
Retinaldehyde*
8.Effects of Very Low Calorie Diet using Meal Replacements on Weight Reduction and Health in the Obese Adult Women.
Jiyoung KIM ; Sangyeon KIM ; Kyung Ah JUNG ; Yukyung CHANG ; Hyeongsuk CHOI ; Sung CHOI ; Mihyeon PARK ; Seonggil HONG ; Sungjoo HWANG
The Korean Journal of Nutrition 2005;38(9):739-749
This study was performed to investigate the effects of a very low calorie diet (VLCD) using new meal replacements that contain wild grass extracts based on Samul-tang ingredients on weight reduction and health in obese adult women (BMI ≥ 25 kg/m²) for four weeks. Seventy-five women participated in this experiment. Subjects were randomly assigned to three groups: 1) General Diet group (GD group, n = 25) consumed 3 regular meals within 600 kcal/day, 2) Meal replacements group (MR group, n = 25) consumed 1 regular meal and 2 meal replacements within 600 kcal/day, 3) Herbal Meal replacements group (HMR group, n = 25) consumed 1 regular meal and 2 meal replacements within 600 kcal/day. Anthropometric measurements, body composition, biochemical measurements and body symptoms were assessed before (the initial) and after (the 4th week) the study. Anthropometric measurements such as weight, waist and hip circumference, and BMI, and body composition measures such as body fat percent and fat mass, significantly decreased in all groups after the diet intervention. Anthropometric measurements and body composition of the HMR group decreased significantly more than those of the GD and MR groups. Serum total cholesterol was significantly decreased in all groups. However, there was no significant difference among the three groups during the experimental period. The HMR group experienced significantly less discomfort than the GD and MR groups from body symptoms such as anemia, powerlessness, vomiting, constipation and dryness of skin during the experimental period. Therefore, a very low calorie diet (VLCD) using meal replacements that contain wild grass extracts based on Samul-tang ingredients was very effective for weight reduction and health in obese adult women.
Adipose Tissue
;
Adult*
;
Anemia
;
Anthropometry
;
Body Composition
;
Caloric Restriction*
;
Cholesterol
;
Constipation
;
Diet
;
Female
;
Hip
;
Humans
;
Meals*
;
Poaceae
;
Skin
;
Vomiting
;
Weight Loss*
9.Effects of Very Low Calorie Diet using Meal Replacements on Psychological Factors and Quality of Life in the Obese Women Aged Twenties.
Jiyoung KIM ; Sangyeon KIM ; Kyunga JUNG ; Yukyung CHANG ; Hyeongsuk CHOI ; Sung CHOI ; Mihyeon PARK ; Seonggil HONG ; Sungjoo HWANG
The Korean Journal of Nutrition 2007;40(7):639-649
This study was performed to investigate the effects of a very low calorie diet (VLCD) using meal replacements that contain wild grass extracts based on Samul-tang ingredients on psychological factors and quality of life in obese women (BMI ≥ 25 kg/m²) for four weeks. Seventy-five women (20 ≤ age < 26) participated in this experiment. Subjects were randomly assigned to three groups: 1) General diet group (GD group, n = 27) consumed 3 regular meals within 600 kcal/day, 2) Meal replacements group (MR group, n = 27) consumed 1 regular meal and 2 meal replacements within 600 kcal/day, 3) Herbal Meal replacements group (HMR group, n = 27) consumed 1 regular meal and 2 meal replacements within 600 kcal/day. Physical factors (weight, BMI, fat(%)) of the HMR group decreased significantly more than those of the GD and MR groups. Moreover, binge eating habit and environmental factors (surrounding support, emotional reaction, expression of opinion) of the HMR group decreased significantly more than those of the GD and MR groups. Psychological factors and quality of life showed no significant differences among the three groups during the experimental period, because both were significantly decreased in all groups after 4 weeks. Therefore, a very low calorie diet using meal replacements that contain wild grass extracts based on Samul-tang ingredients for 4 weeks was effective for improving psychological factors and quality of life as well as for weight reduction in obese premenopausal women.
Bulimia
;
Caloric Restriction*
;
Diet
;
Female
;
Humans
;
Meals*
;
Poaceae
;
Psychology*
;
Quality of Life*
;
Weight Loss
10.Single Nucleotide Deletion Mutation of KCNH2 Gene is Responsible for LQT Syndrome in a 3-Generation Korean Family.
Jong Keun PARK ; Yong Seog OH ; Jee Hyun CHOI ; Sungjoo Kim YOON
Journal of Korean Medical Science 2013;28(9):1388-1393
Long QT syndrome (LQTS) is characterized by prolongation of the QT interval on ECG and a predisposition to life-threatening arrhythmia, which often leads to sudden cardiac death. We encountered a 3-generation family with 5 affected family members in which LQTS was inherited in an autosomal dominant manner. LQTS is considered an ion channel disorder in which the type and location of the genetic mutation determine to a large extent the expression of the clinical syndrome. Upon screening the genomic sequences of cardiac potassium ion channel genes, we found a single nucleotide C deletion mutation in exon 3 of the KCNH2 gene that co-segregates with LQTS in this family. This mutation presumably resulted in a frameshift mutation, P151fs+15X. This study adds a new genetic cause to the pool of mutations that lead to defective potassium ion channels in the heart.
Adolescent
;
Adult
;
Aged
;
Aged, 80 and over
;
Asian Continental Ancestry Group/*genetics
;
DNA Mutational Analysis
;
Ether-A-Go-Go Potassium Channels/*genetics
;
Exons
;
Female
;
Frameshift Mutation
;
Genotype
;
Humans
;
Long QT Syndrome/*diagnosis/genetics
;
Male
;
Middle Aged
;
Pedigree
;
Republic of Korea
;
Sequence Deletion