1.Synthetic data production for biomedical research
Yun Gyeong LEE ; Mi-Sook KWAK ; Jeong Eun KIM ; Min Sun KIM ; Dong Un NO ; Hee Youl CHAI
Osong Public Health and Research Perspectives 2025;16(2):94-99
Synthetic data, generated using advanced artificial intelligence (AI) techniques, replicates the statistical properties of real-world datasets while excluding identifiable information. Although synthetic data does not consist of actual data points, it is derived from original datasets, thereby enabling analyses that yield results comparable to those obtained with real data. Synthetic datasets are evaluated based on their utility, a measure of how effectively they mirror real data for analytical purposes. This paper presents the generation of synthetic datasets through the Healthcare Big Data Showcase Project (2019–2023). The original dataset comprises comprehensive multi-omics data from 400 individuals, including cancer survivors, chronic disease patients, and healthy participants. Synthetic data facilitates efficient access and robust analyses, serving as a practical tool for research and education. It addresses privacy concerns, supports AI research, and provides a foundation for innovative applications across diverse fields, such as public health and precision medicine.
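The idea of utility described in the abstract above can be illustrated with a minimal sketch. This is not the project's generation method, which the abstract does not describe: here a simple fitted normal distribution stands in for the AI-based generator, and utility is approximated by how closely the mean and standard deviation of the synthetic values track the real ones. The function names and the sample measurements are hypothetical.

```python
import random
import statistics


def synthesize_gaussian(real_values, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to real_values.

    Stand-in generator (an assumption): no real record is copied, only the
    fitted mean and standard deviation are used.
    """
    rng = random.Random(seed)
    mu = statistics.mean(real_values)
    sigma = statistics.stdev(real_values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def utility_gap(real_values, synthetic_values):
    """Absolute gaps in mean and standard deviation (smaller means closer)."""
    return {
        "mean_gap": abs(statistics.mean(real_values) - statistics.mean(synthetic_values)),
        "sd_gap": abs(statistics.stdev(real_values) - statistics.stdev(synthetic_values)),
    }


# Hypothetical measurements, not taken from the paper's dataset.
real = [5.1, 4.8, 6.2, 5.5, 5.9, 4.7, 6.0, 5.3]
synthetic = synthesize_gaussian(real, n=400, seed=42)
print(utility_gap(real, synthetic))
```

Matching two summary statistics is only the crudest utility check; a fuller evaluation would typically also compare correlations between variables and the results of downstream analyses.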
6.Primary Cutaneous CD30+ Lymphoproliferative Disorders in South Korea: A Nationwide, Multi-Center, Retrospective, Clinical, and Prognostic Study
Woo Jin LEE ; Sook Jung YUN ; Joon Min JUNG ; Joo Yeon KO ; Kwang Ho KIM ; Dong Hyun KIM ; Myung Hwa KIM ; You Chan KIM ; Jung Eun KIM ; Chan-Ho NA ; Je-Ho MUN ; Jong Bin PARK ; Ji-Hye PARK ; Hai-Jin PARK ; Dong Hoon SHIN ; Jeonghyun SHIN ; Sang Ho OH ; Seok-Kweon YUN ; Dongyoun LEE ; Seok-Jong LEE ; Seung Ho LEE ; Young Bok LEE ; Soyun CHO ; Sooyeon CHOI ; Jae Eun CHOI ; Mi Woo LEE ; On behalf of The Korean Society of Dermatopathology
Annals of Dermatology 2025;37(2):75-85
Background:
Primary cutaneous CD30+ lymphoproliferative disorders (pcCD30-LPDs) are a group of diseases with various clinical and prognostic characteristics.
Objective:
To increase our knowledge of the clinical characteristics of pcCD30-LPDs and to identify potential prognostic variables in an Asian population.
Methods:
Clinicopathological features and survival data of pcCD30-LPD cases obtained from 22 hospitals in South Korea were examined.
Results:
A total of 413 cases of pcCD30-LPDs (lymphomatoid papulosis [LYP], n=237; primary cutaneous anaplastic large cell lymphoma [C-ALCL], n=176) were included. Ninety percent of LYP patients and roughly 50% of C-ALCL patients presented with multiple skin lesions. Both LYP and C-ALCL affected the lower limbs most frequently. Multiplicity and advanced T stage of LYP lesions were associated with a chronic course longer than 6 months. Clinical morphology with patch lesions and elevated serum lactate dehydrogenase were significantly associated with LPDs during follow-up in LYP patients. Extracutaneous involvement of C-ALCL occurred in 13.2% of patients. Lesions larger than 5 cm and increased serum lactate dehydrogenase were associated with a poor prognosis in C-ALCL. The survival of patients with C-ALCL was unaffected by the anatomical locations of skin lesions or other pathological factors.
Conclusion:
The multiplicity or size of skin lesions was associated with a chronic course of LYP and survival among patients with C-ALCL.
7.Difference in Baseline Antimicrobial Prescription Patterns of Hospitals According to Participation in the National Antimicrobial Monitoring and Feedback System in Korea
Jihye SHIN ; Ji Young PARK ; Jungmi CHAE ; Hyung-Sook KIM ; Song Mi MOON ; Eunjeong HEO ; Se Yoon PARK ; Dong Min SEO ; Ha-Jin CHUN ; Yong Chan KIM ; Myung Jin LEE ; Kyungmin HUH ; Hyo Jung PARK ; I Ji YUN ; Su Jin JEONG ; Jun Yong CHOI ; Dong-Sook KIM ; Bongyoung KIM
Journal of Korean Medical Science 2024;39(29):e216-
This study aimed to evaluate the differences in the baseline characteristics and patterns of antibiotic usage among hospitals based on their participation in the Korea National Antimicrobial Use Analysis System (KONAS). We obtained claims data from the National Health Insurance for inpatients admitted to all secondary- and tertiary-care hospitals between January 2020 and December 2021 in Korea. Of these hospitals, 15.9% (58/395) were KONAS participants, among which the proportions of hospitals with > 900 beds (31.0% vs. 2.6%, P < 0.001) and of tertiary-care hospitals (50.0% vs. 5.2%, P < 0.001) were higher than among non-participants. The consumption of antibiotics targeting antimicrobial-resistant gram-positive bacteria (33.7 vs. 27.1 days of therapy [DOT]/1,000 patient-days, P = 0.019) and of antibiotics predominantly used for resistant gram-negative bacteria (4.8 vs. 3.7 DOT/1,000 patient-days, P = 0.034) was higher in KONAS-participating than in non-participating hospitals. The current KONAS data do not fully represent all secondary- and tertiary-care hospitals in Korea; thus, the KONAS results should be interpreted with caution.
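The consumption figures above are expressed as days of therapy (DOT) per 1,000 patient-days. As a minimal sketch of how that rate is usually derived (the abstract does not give the underlying counts, so the function name and the numbers below are hypothetical), DOT is the total number of days on which patients received a given agent, normalized by total inpatient days:

```python
def dot_per_1000_patient_days(days_of_therapy: float, patient_days: float) -> float:
    """Normalize antibiotic days of therapy to a rate per 1,000 patient-days."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return days_of_therapy / patient_days * 1000


# Hypothetical counts chosen for illustration only; not taken from the study.
rate = dot_per_1000_patient_days(days_of_therapy=3370, patient_days=100_000)
print(round(rate, 1))
```

Normalizing by patient-days is what makes hospitals of very different sizes comparable, which matters here because KONAS participants were disproportionately large and tertiary-care institutions.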
8.Lazertinib versus Gefitinib as First-Line Treatment for EGFR-mutated Locally Advanced or Metastatic NSCLC: LASER301 Korean Subset
Ki Hyeong LEE ; Byoung Chul CHO ; Myung-Ju AHN ; Yun-Gyoo LEE ; Youngjoo LEE ; Jong-Seok LEE ; Joo-Hang KIM ; Young Joo MIN ; Gyeong-Won LEE ; Sung Sook LEE ; Kyung-Hee LEE ; Yoon Ho KO ; Byoung Yong SHIM ; Sang-We KIM ; Sang Won SHIN ; Jin-Hyuk CHOI ; Dong-Wan KIM ; Eun Kyung CHO ; Keon Uk PARK ; Jin-Soo KIM ; Sang Hoon CHUN ; Jangyoung WANG ; SeokYoung CHOI ; Jin Hyoung KANG
Cancer Research and Treatment 2024;56(1):48-60
Purpose:
This subgroup analysis of the Korean subset of patients in the phase 3 LASER301 trial evaluated the efficacy and safety of lazertinib versus gefitinib as first-line therapy for epidermal growth factor receptor mutated (EGFRm) non–small cell lung cancer (NSCLC).
Materials and Methods:
Patients with locally advanced or metastatic EGFRm NSCLC were randomized 1:1 to lazertinib (240 mg/day) or gefitinib (250 mg/day). The primary endpoint was investigator-assessed progression-free survival (PFS).
Results:
In total, 172 Korean patients were enrolled (lazertinib, n=87; gefitinib, n=85). Baseline characteristics were balanced between the treatment groups. One-third of patients had brain metastases (BM) at baseline. Median PFS was 20.8 months (95% confidence interval [CI], 16.7 to 26.1) for lazertinib and 9.6 months (95% CI, 8.2 to 12.3) for gefitinib (hazard ratio [HR], 0.41; 95% CI, 0.28 to 0.60). This was supported by PFS analysis based on blinded independent central review. Significant PFS benefit with lazertinib was consistently observed across predefined subgroups, including patients with BM (HR, 0.28; 95% CI, 0.15 to 0.53) and those with L858R mutations (HR, 0.36; 95% CI, 0.20 to 0.63). Lazertinib safety data were consistent with its previously reported safety profile. Common adverse events (AEs) in both groups included rash, pruritus, and diarrhoea. Numerically fewer severe AEs and severe treatment–related AEs occurred with lazertinib than with gefitinib.
Conclusion:
Consistent with results for the overall LASER301 population, this analysis showed significant PFS benefit with lazertinib versus gefitinib with comparable safety in Korean patients with untreated EGFRm NSCLC, supporting lazertinib as a new potential treatment option for this patient population.
9.Clinical Practice Recommendations for the Use of Next-Generation Sequencing in Patients with Solid Cancer: A Joint Report from KSMO and KSP
Miso KIM ; Hyo Sup SHIM ; Sheehyun KIM ; In Hee LEE ; Jihun KIM ; Shinkyo YOON ; Hyung-Don KIM ; Inkeun PARK ; Jae Ho JEONG ; Changhoon YOO ; Jaekyung CHEON ; In-Ho KIM ; Jieun LEE ; Sook Hee HONG ; Sehhoon PARK ; Hyun Ae JUNG ; Jin Won KIM ; Han Jo KIM ; Yongjun CHA ; Sun Min LIM ; Han Sang KIM ; Choong-kun LEE ; Jee Hung KIM ; Sang Hoon CHUN ; Jina YUN ; So Yeon PARK ; Hye Seung LEE ; Yong Mee CHO ; Soo Jeong NAM ; Kiyong NA ; Sun Och YOON ; Ahwon LEE ; Kee-Taek JANG ; Hongseok YUN ; Sungyoung LEE ; Jee Hyun KIM ; Wan-Seop KIM
Cancer Research and Treatment 2024;56(3):721-742
In recent years, next-generation sequencing (NGS)–based genetic testing has become crucial in cancer care. While its primary objective is to identify actionable genetic alterations to guide treatment decisions, its scope has broadened to encompass aiding in pathological diagnosis and exploring resistance mechanisms. With the ongoing expansion in NGS application and reliance, a compelling necessity arises for expert consensus on its application in solid cancers. To address this demand, the forthcoming recommendations not only provide pragmatic guidance for the clinical use of NGS but also systematically classify actionable genes based on specific cancer types. Additionally, these recommendations will incorporate expert perspectives on crucial biomarkers, ensuring informed decisions regarding circulating tumor DNA panel testing.
10.A 10-Gene Signature to Predict the Prognosis of Early-Stage Triple-Negative Breast Cancer
Chang Min KIM ; Kyong Hwa PARK ; Yun Suk YU ; Ju Won KIM ; Jin Young PARK ; Kyunghee PARK ; Jong-Han YU ; Jeong Eon LEE ; Sung Hoon SIM ; Bo Kyoung SEO ; Jin Kyeoung KIM ; Eun Sook LEE ; Yeon Hee PARK ; Sun-Young KONG
Cancer Research and Treatment 2024;56(4):1113-1125
Purpose:
Triple-negative breast cancer (TNBC) is a particularly challenging subtype of breast cancer, with a poorer prognosis compared to other subtypes. Unfortunately, unlike luminal-type cancers, there is no validated biomarker to predict the prognosis of patients with early-stage TNBC. Accurate biomarkers are needed to establish effective therapeutic strategies.
Materials and Methods:
In this study, we analyzed gene expression profiles of tumor samples from 184 TNBC patients (training cohort, n=76; validation cohort, n=108) using RNA sequencing.
Results:
By combining weighted gene expression, we identified a 10-gene signature (DGKH, GADD45B, KLF7, LYST, NR6A1, PYCARD, ROBO1, SLC22A20P, SLC24A3, and SLC45A4) that stratified patients by risk score with high sensitivity (92.31%), specificity (92.06%), and accuracy (92.11%) for invasive disease-free survival. The 10-gene signature was validated in a separate institution cohort and supported by meta-analysis for biological relevance to well-known driving pathways in TNBC. Furthermore, the 10-gene signature was the only independent factor for invasive disease-free survival in multivariate analysis when compared to other potential biomarkers of TNBC molecular subtypes and T-cell receptor β diversity. The 10-gene signature also further categorized patients already classified by molecular subtype according to their risk scores.
Conclusion:
Our novel findings may help address the prognostic challenges in TNBC, and the 10-gene signature could serve as a novel biomarker for risk-based patient care.
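The sensitivity, specificity, and accuracy reported for the 10-gene signature follow from a standard 2x2 classification table. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here are an assumption: they are one combination that reproduces the published percentages if those figures refer to the 76-patient training cohort (13 patients with an invasive disease-free survival event, 63 without).

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)                 # events correctly flagged high risk
    specificity = tn / (tn + fp)                 # non-events correctly flagged low risk
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls over all patients
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): 12 of 13 events and
# 58 of 63 non-events classified correctly, for 76 patients in total.
sens, spec, acc = classification_metrics(tp=12, fn=1, tn=58, fp=5)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")
```

With few events, each misclassified patient shifts sensitivity by several percentage points, which is one reason validation in a separate cohort matters.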
