1.Synthetic data production for biomedical research
Yun Gyeong LEE ; Mi-Sook KWAK ; Jeong Eun KIM ; Min Sun KIM ; Dong Un NO ; Hee Youl CHAI
Osong Public Health and Research Perspectives 2025;16(2):94-99
Synthetic data, generated using advanced artificial intelligence (AI) techniques, replicates the statistical properties of real-world datasets while excluding identifiable information. Although synthetic data does not consist of actual data points, it is derived from original datasets, thereby enabling analyses that yield results comparable to those obtained with real data. Synthetic datasets are evaluated based on their utility, a measure of how effectively they mirror real data for analytical purposes. This paper presents the generation of synthetic datasets through the Healthcare Big Data Showcase Project (2019–2023). The original dataset comprises comprehensive multi-omics data from 400 individuals, including cancer survivors, chronic disease patients, and healthy participants. Synthetic data facilitates efficient access and robust analyses, serving as a practical tool for research and education. It addresses privacy concerns, supports AI research, and provides a foundation for innovative applications across diverse fields, such as public health and precision medicine.
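As an illustration of what a utility check might look like, the following minimal Python sketch compares each variable's mean and spread between a real table and a synthetic one. The metrics (standardized mean difference and standard-deviation ratio), the function name `utility_report`, and the toy values are assumptions made for illustration; the abstract does not state which utility measures the project used.

```python
import statistics


def utility_report(real, synthetic):
    """Compare per-variable summary statistics of real vs. synthetic data.

    Both arguments map a variable name to a list of numeric values.
    Returns, per variable, the standardized mean difference (SMD) and the
    ratio of synthetic to real standard deviation. Values near 0 (SMD) and
    near 1 (ratio) suggest the synthetic column mirrors the real one.
    """
    report = {}
    for name, real_values in real.items():
        synth_values = synthetic[name]
        real_sd = statistics.pstdev(real_values)
        synth_sd = statistics.pstdev(synth_values)
        pooled_sd = ((real_sd ** 2 + synth_sd ** 2) / 2) ** 0.5
        mean_gap = abs(statistics.fmean(real_values) - statistics.fmean(synth_values))
        if pooled_sd > 0:
            smd = mean_gap / pooled_sd
        else:
            # Both columns are constant: identical means give 0, otherwise infinity.
            smd = 0.0 if mean_gap == 0 else float("inf")
        sd_ratio = synth_sd / real_sd if real_sd > 0 else float("nan")
        report[name] = {"smd": smd, "sd_ratio": sd_ratio}
    return report


# Hypothetical toy values, not taken from the project data.
real_table = {"age": [40, 50, 60]}
synthetic_table = {"age": [45, 55, 65]}
age_utility = utility_report(real_table, synthetic_table)["age"]
print(round(age_utility["smd"], 3), round(age_utility["sd_ratio"], 3))
```

Comparing marginal statistics is only one possible view of utility; a fuller assessment would typically also check whether analyses run on the synthetic data reproduce the conclusions drawn from the real data.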
2.Pre-Treatment Perceived Social Support Is Associated With Chemotherapy-Induced Peripheral Neuropathy in Patients With Breast Cancer: A Longitudinal Study
Joon Sung SHIN ; Sanghyup JUNG ; Geun Hui WON ; Sun Hyung LEE ; Jaehyun KIM ; Saim JUNG ; Chan-Woo YEOM ; Kwang-Min LEE ; Kyung-Lak SON ; Jang-il KIM ; Sook Young JEON ; Han-Byoel LEE ; Bong-Jin HAHM
Psychiatry Investigation 2025;22(4):424-434
Objective:
Previous studies have reported an association between cancer-related symptoms and perceived social support (PSS). The objective of this study was to analyze whether Chemotherapy-Induced Peripheral Neuropathy (CIPN), a prevalent side effect of chemotherapy, varies according to PSS level using a validated tool for CIPN at prospective follow-up.
Methods:
A total of 39 breast cancer patients were evaluated for PSS using the Multidimensional Scale of Perceived Social Support (MSPSS) prior to chemotherapy and were subsequently grouped into one of two categories for each subscale: low-to-moderate PSS and high PSS. CIPN was prospectively evaluated using the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Chemotherapy-Induced Peripheral Neuropathy 20 (CIPN20) at five time points. A linear mixed-effects model with square root transformation was employed to investigate whether the CIPN20 scales varied by PSS level and time point.
Results:
Statistical analysis of the MSPSS total scale and subscales revealed a significant effect of the friends subscale group and time point on the CIPN20 sensory scale. The sensory scale score of CIPN20 was found to be lower in participants with high PSS from friends in comparison to those with low-to-moderate PSS at 1 month post-chemotherapy (p=0.010).
Conclusion:
This is the first study to prospectively follow the long-term effect of pre-treatment PSS from friends on CIPN. Further studies based on larger samples are required to analyze the effects of PSS on the pathophysiology of CIPN.
3.Development and Application of New Risk-Adjustment Models to Improve the Current Model for Hospital Standardized Mortality Ratio in South Korea
Hyeki PARK ; Ji-Sook CHOI ; Min Sun SHIN ; Soomin KIM ; Hyekyoung KIM ; Nahyeong IM ; Soon Joo PARK ; Donggyo SHIN ; Youngmi SONG ; Yunjung CHO ; Hyunmi JOO ; Hyeryeon HONG ; Yong-Hwa HWANG ; Choon-Seon PARK
Yonsei Medical Journal 2025;66(3):179-186
Purpose:
This study assessed the validity of the hospital standardized mortality ratio (HSMR) risk-adjusted model by comparing models that include clinical information and the current model based on administrative information in South Korea.
Materials and Methods:
Data from 53,976 inpatients were analyzed. The current HSMR risk-adjusted model (Model 1) adjusts for sex, age, health coverage, emergency hospitalization status, main diagnosis, surgery status, and Charlson Comorbidity Index (CCI) using administrative data. The candidate clinical variables collected were the American Society of Anesthesiologists score, Acute Physiology and Chronic Health Evaluation (APACHE) II, Simplified Acute Physiology Score (SAPS) 3, present-on-admission CCI, and cancer stage. Surgery status, care in the intensive care unit, and CCI were selected as proxy variables from the administrative data. In-hospital death was defined as the dependent variable, and a logistic regression analysis was performed. The statistical performance of each model was compared using C-index values.
Results:
There was a strong correlation between variables in the administrative data and those in the medical records. The C-index of the existing model (Model 1) was 0.785; Model 2, which included all clinical data, had a higher C-index of 0.857. In Model 4, in which APACHE II and SAPS 3 were replaced with variables recorded in the administrative data from Model 2, the C-index further increased to 0.863.
Conclusion:
The HSMR risk-adjustment model improved when it was adjusted for clinical data. The validity of the evaluation method could also be maintained even when some of the clinical information was replaced with information from the administrative data.
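The model comparison described in this abstract rests on the C-index, which for a binary outcome such as in-hospital death equals the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. The minimal Python sketch below computes it by pairwise comparison and also shows the conventional HSMR definition (observed deaths divided by model-expected deaths, times 100). The function names and the six-patient toy data are hypothetical, and the HSMR formula is the standard textbook form rather than one quoted from the study.

```python
def c_index(died, predicted_risk):
    """Concordance index for a binary outcome.

    Counts, over all (death, survivor) pairs, how often the death received
    the higher predicted risk; tied predictions count as half.
    """
    death_scores = [s for d, s in zip(died, predicted_risk) if d == 1]
    survivor_scores = [s for d, s in zip(died, predicted_risk) if d == 0]
    if not death_scores or not survivor_scores:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for death_score in death_scores:
        for survivor_score in survivor_scores:
            if death_score > survivor_score:
                concordant += 1.0
            elif death_score == survivor_score:
                concordant += 0.5
    return concordant / (len(death_scores) * len(survivor_scores))


def hsmr(observed_deaths, predicted_risk):
    """Observed deaths over expected deaths (sum of predicted risks), times 100."""
    expected_deaths = sum(predicted_risk)
    return 100.0 * observed_deaths / expected_deaths


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
died = [0, 0, 1, 0, 1, 1]
risk_model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30]  # e.g. fewer covariates
risk_model_b = [0.10, 0.20, 0.60, 0.15, 0.90, 0.50]  # e.g. more covariates

c_a = c_index(died, risk_model_a)
c_b = c_index(died, risk_model_b)
hsmr_b = hsmr(sum(died), risk_model_b)
print(round(c_a, 3), round(c_b, 3), round(hsmr_b, 1))
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; with tens of thousands of inpatients a rank-based computation would be the more practical choice.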
4.Professional biobanking education in Korea based on ISO 20387
Jong Ok KIM ; Chungyeul KIM ; Sangyong SONG ; Eunah SHIN ; Ji-Sun SONG ; Mee Sook ROH ; Dong-chul KIM ; Han-Kyeom KIM ; Joon Mee KIM ; Yeong Jin CHOI
Journal of Pathology and Translational Medicine 2025;59(1):11-25
To ensure high-quality bioresources and standardize biobanks, there is an urgent need to develop and disseminate educational training programs in accordance with ISO 20387, which was developed in 2018. The standardization of biobank education programs is also required to train biobank experts. The subdivision of categories and levels of education is necessary for jobs such as operations manager (bank president), quality manager, practitioner, and administrator. Essential training includes programs tailored for beginner, intermediate, and advanced practitioners, along with customized training for operations managers. We reviewed and studied ways to develop an appropriate range of education and training opportunities for standard biobanking education and the training of experts based on KS J ISO 20387. We propose more systematic and professional biobanking training programs in accordance with ISO 20387, in addition to the certification programs of the National Biobank and the Korean Laboratory Accreditation System. We suggest various training programs appropriate to a student’s affiliation or work, such as university biobanking specialized education, short-term job training at unit biobanks, biobank research institute symposiums by the Korean Society of Pathologists, and education programs for biobankers and researchers. Through these various education programs, we expect that Korean biobanks will satisfy global standards, meet the needs of users and researchers, and contribute to the advancement of science.
5.Risk of non-cancer respiratory diseases attributed to humidifier disinfectant exposure in Koreans: age-period-cohort and differences-in-difference analyses
Jaiyong KIM ; Kyoung Sook JEONG ; Seungyeon HEO ; Younghee KIM ; Jungyun LIM ; Sol YU ; Suejin KIM ; Sun-Kyoung SHIN ; Hae-Kwan CHEONG ; Mina HA
Epidemiology and Health 2025;47(1):e2025006
OBJECTIVES:
Humidifier disinfectants (HDs) were sold in Korea from 1994 until their recall in 2011. We examined the incidence patterns of 8 respiratory diseases before and after the HD recall and estimated the attributable risk in the Korean population.
METHODS:
Using National Health Insurance data from 2002 to 2019, we performed age-period-cohort and differences-in-difference analyses (comparing periods before vs. after the recall) to estimate the population-attributable fraction and the excess number of episodes. The database comprised 51 million individuals (99% of the Korean population). The incidence of 8 diseases (acute upper respiratory infection [AURI], acute lower respiratory infection [ALRI], asthma, pneumonia, chronic sinusitis [CS], interstitial lung disease [ILD], bronchiectasis, and chronic obstructive pulmonary disease [COPD]) was defined by constructing episodes of care based on patterns of medical care and the clinical characteristics of each disease.
RESULTS:
The relative risks (RRs) for AURI, ALRI, asthma, pneumonia, CS, and ILD were elevated among younger individuals (with an RR as high as 82.18 for AURI in males), whereas chronic conditions such as bronchiectasis, COPD, and ILD showed higher RRs in older individuals. During the HD exposure period, the population-attributable risk percentage ranged from 4.6% for bronchiectasis to 25.1% for pneumonia, with the excess number of episodes ranging from 6,218 for ILD to 3,058,861 for CS. Notably, females of reproductive age (19-44 years) experienced 1.1-9.2 times more excess episodes than males.
CONCLUSIONS:
This study provides epidemiological evidence that inhalation exposure to HDs affects the entire respiratory tract and identifies vulnerable groups.
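To make the quantities named in this abstract concrete, the following minimal Python sketch shows a generic two-period difference-in-differences contrast, Levin's formula for the population-attributable fraction, and the resulting excess number of episodes. These are textbook forms chosen for illustration; the abstract does not specify the exact estimators the authors used, and all function names and numbers below are hypothetical.

```python
def difference_in_differences(exposed_pre, exposed_post, control_pre, control_post):
    """Change in the exposed group minus the change in the comparison group."""
    return (exposed_post - exposed_pre) - (control_post - control_pre)


def population_attributable_fraction(exposure_prevalence, relative_risk):
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1))."""
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def excess_episodes(observed_episodes, paf):
    """Episodes that would not have occurred without the exposure."""
    return observed_episodes * paf


# Hypothetical incidence rates per 1,000 before and after a recall.
did = difference_in_differences(
    exposed_pre=120.0, exposed_post=80.0, control_pre=100.0, control_post=95.0
)

# Hypothetical exposure prevalence of 20% and relative risk of 2.0.
paf = population_attributable_fraction(exposure_prevalence=0.2, relative_risk=2.0)
excess = excess_episodes(observed_episodes=12000, paf=paf)
print(did, round(paf, 4), round(excess))
```

In this toy example the exposed group's incidence falls by 35 per 1,000 more than the comparison group's, and about one sixth of the observed episodes would be attributed to the exposure.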
6.Role of HIF-1α in the Responses of Tumors to Radiotherapy and Chemotherapy
Chang W SONG ; Hyunkyung KIM ; Mi-Sook KIM ; Heon J PARK ; Sun-Ha PAEK ; Stephanie TEREZAKIS ; L Chinsoo CHO
Cancer Research and Treatment 2025;57(1):1-10
The tumor microenvironment is intrinsically hypoxic, with abundant hypoxia-inducible factor-1α (HIF-1α), a primary regulator of the cellular response to hypoxia and to various stresses imposed on tumor cells. HIF-1α increases radioresistance and chemoresistance by reducing DNA damage, increasing repair of DNA damage, enhancing glycolysis that increases the antioxidant capacity of tumor cells, and promoting angiogenesis. In addition, HIF-1α markedly enhances drug efflux, leading to multidrug resistance. Radiotherapy and certain chemotherapy drugs evoke profound anti-tumor immunity by inducing immunologic cell death that releases tumor-associated antigens together with numerous pro-immunological factors, leading to priming of cytotoxic CD8+ T cells and enhancing the cytotoxicity of macrophages and natural killer cells. Radiotherapy and chemotherapy of tumors significantly increase HIF-1α activity in tumor cells. Unfortunately, HIF-1α effectively promotes various immune suppressive pathways, including secretion of immune suppressive cytokines, activation of myeloid-derived suppressor cells, activation of regulatory T cells, inhibition of T cell priming and activity, and upregulation of immune checkpoints. Consequently, the anti-tumor immunity elevated by radiotherapy and chemotherapy is counterbalanced or masked by the potent immune suppression promoted by HIF-1α. Effective inhibition of HIF-1α may significantly increase the efficacy of radiotherapy and chemotherapy by increasing the radiosensitivity and chemosensitivity of tumor cells and also by upregulating anti-tumor immunity.
