1. Cancer risk in Korean patients with gout
Yoon-Jeong OH ; Yun Jong LEE ; Eunyoung LEE ; Bumhee PARK ; Jae-Woo KWON ; Jeongwon HEO ; Ki Won MOON
The Korean Journal of Internal Medicine 2022;37(2):460-467
Background/Aims:
Using a nationwide cohort, we investigated the cancer risk in Korean patients with gout.
Methods:
Data were obtained from the Korean National Health Insurance Service Database. Patients with gout were defined as those aged ≥ 20 years who were diagnosed with gout and received anti-gout medication (allopurinol, colchicine, or benzbromarone) between 2008 and 2010. Patients with nail disorders were randomly assigned to a control group (1:1 ratio) after frequency matching for age and sex. Cancer incidence was then investigated between 2012 and 2018. Cox proportional hazards regression analysis was used to investigate the association between gout and cancer after adjusting for concomitant diseases.
Results:
This study included 179,930 patients with gout and an equal number of matched controls. The incidence of overall cancer was higher in patients with gout than in controls (incidence rate ratio, 1.08). Cox proportional hazards regression analysis showed that gout was associated with a hazard ratio of 1.053 (95% confidence interval, 1.031 to 1.077) after adjusting for concomitant diseases.
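As an illustration of how an incidence rate ratio like the one reported above is formed, the following minimal sketch divides the event rate per person-year in an exposed group by the rate in a control group and attaches a Wald interval on the log scale. The function name `incidence_rate_ratio` and all counts are hypothetical; the abstract does not report event counts or person-years, and this is not the authors' analysis code.

```python
import math


def incidence_rate_ratio(events_exposed, py_exposed, events_control, py_control, z=1.96):
    """Return (IRR, lower, upper) with a Wald interval computed on the log scale."""
    rate_exposed = events_exposed / py_exposed
    rate_control = events_control / py_control
    irr = rate_exposed / rate_control
    # Standard error of log(IRR), treating both event counts as independent Poisson counts.
    se_log = math.sqrt(1 / events_exposed + 1 / events_control)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts chosen for illustration only; they are NOT from the study.
irr, lo, hi = incidence_rate_ratio(
    events_exposed=1080, py_exposed=100_000,
    events_control=1000, py_control=100_000,
)
print(f"IRR = {irr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An unadjusted ratio of this kind ignores confounders, which is why the abstract also reports a hazard ratio from a Cox model adjusted for concomitant diseases.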
Conclusions
Gout was associated with a significantly increased risk of cancer, especially esophageal, stomach, colon, liver, pancreatic, lung, ovarian, renal, and bladder cancers.
2. Word Embedding Reveals Cyfra 21-1 as a Biomarker for Chronic Obstructive Pulmonary Disease
Jeongwon HEO ; Da Hye MOON ; Yoonki HONG ; So Hyeon BAK ; Jeeyoung KIM ; Joo Hyun PARK ; Byoung-Doo OH ; Yu-Seop KIM ; Woo Jin KIM
Journal of Korean Medical Science 2021;36(35):e224
Background:
Although patients with chronic obstructive pulmonary disease (COPD) experience high morbidity and mortality worldwide, few biomarkers are available for COPD. Here, we analyzed potential biomarkers for the diagnosis of COPD by using word embedding.
Methods:
To determine which biomarkers are likely to be associated with COPD, we selected 26 respiratory disease-related biomarkers and measured the degree of similarity between each biomarker and COPD by word embedding. Similarity with COPD was inferred from word embedding models trained on a large medical corpus, and the biomarkers with the highest similarity were identified. We used Word2Vec, Canonical Correlation Analysis, and Global Vectors (GloVe) for word embedding. We evaluated the associations of selected biomarkers with COPD parameters in a cohort of patients with COPD.
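The ranking step described above can be sketched as follows, assuming cosine similarity between embedding vectors as the similarity measure (the abstract does not name the metric). The vectors, the marker names, and the helper functions `cosine_similarity` and `rank_by_similarity` are made up for illustration; in practice the vectors would come from a trained embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, candidates):
    """Return (name, similarity) pairs sorted from most to least similar to target."""
    scores = {name: cosine_similarity(target, vec) for name, vec in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 3-dimensional vectors; hypothetical stand-ins for learned word embeddings.
copd_vector = [0.9, 0.1, 0.3]
biomarker_vectors = {
    "marker_a": [0.8, 0.2, 0.4],
    "marker_b": [0.1, 0.9, 0.2],
    "marker_c": [-0.5, 0.3, 0.1],
}

ranking = rank_by_similarity(copd_vector, biomarker_vectors)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Candidates at the top of such a ranking would then still need clinical validation, as the abstract does with serum measurements in a COPD cohort.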
Results:
Cytokeratin 19 fragment (Cyfra 21-1) was selected because of its high similarity and its significant correlation with the COPD phenotype. Serum Cyfra 21-1 levels were determined in patients with COPD and controls (4.3 ± 5.9 vs. 3.9 ± 3.6 ng/mL, P = 0.611). The emphysema index was significantly correlated with the serum Cyfra 21-1 level (correlation coefficient = 0.219, P = 0.015).
Conclusion
Word embedding may be used for the discovery of biomarkers for COPD, and Cyfra 21-1 may be used as a biomarker for emphysema. Additional studies are needed to validate Cyfra 21-1 as a biomarker for COPD.
4. Plasma CRABP2 as a Novel Biomarker in Patients with Non-Small Cell Lung Cancer
Do Jun KIM ; Woo Jin KIM ; Myoungnam LIM ; Yoonki HONG ; Seung Joon LEE ; Seok Ho HONG ; Jeongwon HEO ; Hui Young LEE ; Seon Sook HAN
Journal of Korean Medical Science 2018;33(26):e178
Background:
Lung cancer is the most common cause of cancer-related mortality worldwide. We previously reported the identification of a new genetic marker, cellular retinoic acid binding protein 2 (CRABP2), in lung cancer tissues. The aim of this study was to assess plasma levels of CRABP2 from patients with non-small cell lung cancer (NSCLC).
Methods:
Blood samples that were collected from 122 patients with NSCLC between September 2009 and September 2013 were selected for the analysis, along with samples from age- (± 5 years), sex-, and cigarette smoking history (± 10 pack-years [PY])-matched controls from the Korea Biobank Network. The control specimens were from patients who were without malignancies or pulmonary diseases. We measured plasma levels of CRABP2 using commercially available enzyme-linked immunosorbent assay kits.
Results:
The mean age of the NSCLC patients was 71.8 ± 8.9 years, and the median cigarette smoking history was 32 PY (range, 0–150 PY). Plasma CRABP2 levels were significantly higher in patients with NSCLC than in the matched controls (37.63 ± 28.71 ng/mL vs. 24.09 ± 21.09 ng/mL, P < 0.001). Higher plasma CRABP2 levels were also correlated with lower survival rates in NSCLC patients (P = 0.014).
Conclusion
Plasma CRABP2 levels might be a novel diagnostic and prognostic marker in NSCLC.
Biomarkers; Carcinoma, Non-Small-Cell Lung*; Carrier Proteins; Enzyme-Linked Immunosorbent Assay; Genetic Markers; Humans; Korea; Lung Diseases; Lung Neoplasms; Mortality; Plasma*; Smoking; Survival Rate; Tretinoin
5. Early Prediction of Mortality for Septic Patients Visiting Emergency Room Based on Explainable Machine Learning: A Real-World Multicenter Study
Sang Won PARK ; Na Young YEO ; Seonguk KANG ; Taejun HA ; Tae-Hoon KIM ; DooHee LEE ; Dowon KIM ; Seheon CHOI ; Minkyu KIM ; DongHoon LEE ; DoHyeon KIM ; Woo Jin KIM ; Seung-Joon LEE ; Yeon-Jeong HEO ; Da Hye MOON ; Seon-Sook HAN ; Yoon KIM ; Hyun-Soo CHOI ; Dong Kyu OH ; Su Yeon LEE ; MiHyeon PARK ; Chae-Man LIM ; Jeongwon HEO ; On behalf of the Korean Sepsis Alliance (KSA) Investigators
Journal of Korean Medical Science 2024;39(5):e53
Background:
Worldwide, sepsis is the leading cause of death in hospitals. If mortality rates in patients with sepsis can be predicted early, medical resources can be allocated efficiently. We constructed machine learning (ML) models to predict the mortality of patients with sepsis in a hospital emergency department.
Methods:
This study prospectively collected nationwide data from an ongoing multicenter cohort of patients with sepsis identified in the emergency department. Patients were enrolled from 19 hospitals between September 2019 and December 2020. Using data from 3,657 survivors and 1,455 non-survivors, six ML models (logistic regression, support vector machine, random forest, extreme gradient boosting [XGBoost], light gradient boosting machine, and categorical boosting [CatBoost]) were constructed using fivefold cross-validation to predict mortality. With these models, 44 clinical variables measured on the day of admission were compared with six sequential organ failure assessment (SOFA) components (PaO2/FiO2 [PF], platelets [PLT], bilirubin, cardiovascular, Glasgow Coma Scale score, and creatinine). The confidence interval (CI) was obtained by performing 10,000 repeated measurements via random sampling of the test dataset. All results were explained and interpreted using SHapley Additive exPlanations (SHAP).
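One common way to obtain a confidence interval for the AUC by repeatedly sampling the test set is a percentile bootstrap, sketched below. This is an assumption about the general approach, not the authors' code: the functions `auc` and `bootstrap_auc_ci`, the synthetic labels and scores, and the reduced number of resamples (300 instead of 10,000, to keep the example fast) are all illustrative.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample test cases with replacement and recompute the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            estimates.append(auc(ys, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic test set for illustration: label 1 = death, scores = model risk scores.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(120)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI, {low:.3f} to {high:.3f})")
```

The interval reflects sampling variability in the test set only; it does not account for variability from retraining the model.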
Results:
Among the 5,112 participants, CatBoost achieved the highest area under the curve (AUC) of 0.800 (95% CI, 0.756–0.840) using clinical variables. Using the SOFA components for the same patients, XGBoost achieved the highest AUC of 0.678 (95% CI, 0.626–0.730). As interpreted by SHAP, albumin, lactate, blood urea nitrogen, and international normalized ratio significantly affected the results. Additionally, PF and PLT among the SOFA components significantly influenced the prediction results.
Conclusion
Newly established ML-based models achieved good prediction of mortality in patients with sepsis. Using several clinical variables acquired at baseline can provide more accurate early predictions than using SOFA components. Additionally, the impact of each variable was identified.