1.Comparison of survival prediction models for pancreatic cancer: Cox model versus machine learning models
Hyunsuk KIM ; Taesung PARK ; Jinyoung JANG ; Seungyeoun LEE
Genomics & Informatics 2022;20(2):e23-
A survival prediction model has recently been developed to evaluate the prognosis of resected nonmetastatic pancreatic ductal adenocarcinoma based on a Cox model using two nationwide databases: Surveillance, Epidemiology and End Results (SEER) and Korea Tumor Registry System-Biliary Pancreas (KOTUS-BP). In this study, we applied two machine learning methods—random survival forests (RSF) and support vector machines (SVM)—for survival analysis and compared their prediction performance using the SEER and KOTUS-BP datasets. Three schemes were used for model development and evaluation. First, we utilized data from SEER for model development and used data from KOTUS-BP for external evaluation. Second, these two datasets were swapped by taking data from KOTUS-BP for model development and data from SEER for external evaluation. Finally, we mixed these two datasets half and half and utilized the mixed datasets for model development and validation. We used 9,624 patients from SEER and 3,281 patients from KOTUS-BP to construct a prediction model with seven covariates: age, sex, histologic differentiation, adjuvant treatment, resection margin status, and the American Joint Committee on Cancer 8th edition T-stage and N-stage. Comparing the three schemes, the performance of the Cox model, RSF, and SVM was better when using the mixed datasets than when using the unmixed datasets. When using the mixed datasets, the C-index, 1-year, 2-year, and 3-year time-dependent areas under the curve for the Cox model were 0.644, 0.698, 0.680, and 0.687, respectively. The Cox model performed slightly better than RSF and SVM.
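The C-index reported above measures how often a model ranks pairs of patients in the correct order of survival. The following is a minimal sketch of Harrell's C-index for right-censored data, not the authors' implementation; the function name, the toy data, and the choice to skip pairs with tied times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times:       observed follow-up time per patient
    events:      1 if the event (death) was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed
        # event. Pairs with tied times are skipped (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: 4 patients, the second one censored.
toy_times = [5, 8, 12, 3]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.2, 0.7]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported values in the 0.6 to 0.7 range in context.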
2.CORRIGENDUM: Diagnostic model for pancreatic cancer using a multi-biomarker panel
Yoo Jin CHOI ; Woongchang YOON ; Areum LEE ; Youngmin HAN ; Yoonhyeong BYUN ; Jae Seung KANG ; Hongbeom KIM ; Wooil KWON ; Young-Ah SUH ; Yongkang KIM ; Seungyeoun LEE ; Junghyun NAMKUNG ; Sangjo HAN ; Yonghwan CHOI ; Jin Seok HEO ; Joon Oh PARK ; Joo Kyung PARK ; Song Cheol KIM ; Chang Moo KANG ; Woo Jin LEE ; Taesung PARK ; Jin-Young JANG
Annals of Surgical Treatment and Research 2021;100(4):252-
4.Diagnostic model for pancreatic cancer using a multi-biomarker panel
Yoo Jin CHOI ; Woongchang YOON ; Areum LEE ; Youngmin HAN ; Yoonhyeong BYUN ; Jae Seung KANG ; Hongbeom KIM ; Wooil KWON ; Young-Ah SUH ; Yongkang KIM ; Seungyeoun LEE ; Junghyun NAMKUNG ; Sangjo HAN ; Yonghwan CHOI ; Jin Seok HEO ; Joon Oh PARK ; Joo Kyung PARK ; Song Cheol KIM ; Chang Moo KANG ; Woo Jin LEE ; Taesung PARK ; Jin-Young JANG
Annals of Surgical Treatment and Research 2021;100(3):144-153
Purpose:
Diagnostic biomarkers of pancreatic ductal adenocarcinoma (PDAC) have been used for early detection to reduce its dismal survival rate. However, clinically feasible biomarkers are still rare. Therefore, in this study, we developed an automated multi-marker enzyme-linked immunosorbent assay (ELISA) kit using 3 biomarkers (leucine-rich alpha-2-glycoprotein [LRG1], transthyretin [TTR], and CA 19-9) that were previously discovered and proposed a diagnostic model for PDAC based on this kit for clinical usage.
Methods:
Individual LRG1, TTR, and CA 19-9 panels were combined into a single automated ELISA panel and tested on 728 plasma samples, including PDAC (n = 381) and normal samples (n = 347). The consistency between the individual panels of the 3 biomarkers and the automated multi-panel ELISA kit was assessed by correlation. The diagnostic model was developed using logistic regression according to the automated ELISA kit to predict the risk of pancreatic cancer (high-, intermediate-, and low-risk groups).
Results:
The Pearson correlation coefficient of predicted values between the triple-marker automated ELISA panel and the former individual ELISA was 0.865. The proposed model provided reliable prediction results with a positive predictive value of 92.05%, negative predictive value of 90.69%, specificity of 90.69%, and sensitivity of 92.05%, all of which exceeded the 90% cutoff value.
Conclusion:
This diagnostic model based on the triple ELISA kit showed better diagnostic performance than previous markers for PDAC. In the future, it needs external validation to be used in the clinic.
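The four performance figures quoted in the Results all derive from a 2x2 confusion matrix. The sketch below shows how they are computed; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp: diseased samples called positive   fn: diseased samples called negative
    tn: normal samples called negative     fp: normal samples called positive
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased samples detected
        "specificity": tn / (tn + fp),  # share of normal samples cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Hypothetical counts for illustration only.
example_metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

Unlike sensitivity and specificity, the predictive values depend on the ratio of diseased to normal samples in the tested cohort, so they may differ in a population with another prevalence.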
5.Development and External Validation of Survival Prediction Model for Pancreatic Cancer Using Two Nationwide Databases: Surveillance, Epidemiology and End Results (SEER) and Korea Tumor Registry System-Biliary Pancreas (KOTUS-BP)
Jae Seung KANG ; Lydia MOK ; Jin Seok HEO ; In Woong HAN ; Sang Hyun SHIN ; Yoo-Seok YOON ; Ho-Seong HAN ; Dae Wook HWANG ; Jae Hoon LEE ; Woo Jung LEE ; Sang Jae PARK ; Joon Seong PARK ; Yonghoon KIM ; Huisong LEE ; Young-Dong YU ; Jae Do YANG ; Seung Eun LEE ; Il Young PARK ; Chi-Young JEONG ; Younghoon ROH ; Seong-Ryong KIM ; Ju Ik MOON ; Sang Kuon LEE ; Hee Joon KIM ; Seungyeoun LEE ; Hongbeom KIM ; Wooil KWON ; Chang-Sup LIM ; Jin-Young JANG ; Taesung PARK
Gut and Liver 2021;15(6):912-921
Background/Aims:
Several prediction models for evaluating the prognosis of nonmetastatic resected pancreatic ductal adenocarcinoma (PDAC) have been developed, and their performances were reported to be superior to that of the 8th edition of the American Joint Committee on Cancer (AJCC) staging system. We developed a prediction model to evaluate the prognosis of resected PDAC and externally validated it with data from a nationwide Korean database.
Methods:
Data from the Surveillance, Epidemiology and End Results (SEER) database were utilized for model development, and data from the Korea Tumor Registry System-Biliary Pancreas (KOTUS-BP) database were used for external validation. Potential candidate variables for model development were age, sex, histologic differentiation, tumor location, adjuvant chemotherapy, and the AJCC 8th staging system T and N stages. For external validation, the concordance index (C-index) and time-dependent area under the receiver operating characteristic curve (AUC) were evaluated.
Results:
Between 2004 and 2016, data from 9,624 patients were utilized for model development, and data from 3,282 patients were used for external validation. In the multivariate Cox proportional hazard model, age, sex, tumor location, T and N stages, histologic differentiation, and adjuvant chemotherapy were independent prognostic factors for resected PDAC. After an exhaustive search and 10-fold cross validation, the best model was finally developed, which included all prognostic variables. The C-index, 1-year, 2-year, 3-year, and 5-year time-dependent AUCs were 0.628, 0.650, 0.665, 0.675, and 0.686, respectively.
Conclusions:
The survival prediction model for resected PDAC could provide quantitative survival probabilities with reliable performance. External validation studies with other nationwide databases are needed to evaluate the performance of this model.
6.Public Attention to Crime of Schizophrenia and Its Correlation with Use of Mental Health Services in Patients with Schizophrenia
Hyunwoo PARK ; Yu Sang LEE ; Sang Yup LEE ; Seungyeoun LEE ; Kyung Sue HONG ; Shinsuke KOIKE ; Jun Soo KWON
Korean Journal of Schizophrenia Research 2019;22(2):34-41
OBJECTIVES: This study was performed to examine the effects of public attention to ‘crime of schizophrenia’ on the use of mental health services in patients with schizophrenia using big data analysis. METHODS: Monthly data on the frequency of internet searches for ‘crime of schizophrenia’ and on the patterns of mental health service utilization by patients with schizophrenia spectrum disorders were collected from Naver big data and the Health Insurance Review and Assessment Services in Korea, respectively. Their correlations were examined within the same month and, to capture lagged effects, with the following month. RESULTS: The number of outpatients correlated negatively with public attention to ‘crime of schizophrenia’ in the same month. A lagged relationship between public attention and the number of admissions to psychiatric wards was also found. In terms of sex differences, the use of outpatient services among female patients correlated negatively with public attention in the same month, while the number of male patients' admissions in both the same and the following month correlated positively with public attention. CONCLUSION: These findings suggest that public attention to ‘crime of schizophrenia’ could negatively affect illness behavior in patients with schizophrenia.
Crime; Female; Humans; Illness Behavior; Insurance, Health; Internet; Korea; Male; Mental Health Services; Mental Health; Outpatients; Schizophrenia; Sex Characteristics; Statistics as Topic
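The same-month and following-month correlations described in the abstract above can be illustrated with a lagged Pearson correlation. This is a minimal sketch under assumed names and made-up monthly series; it does not reproduce the study's data or its exact statistical procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(x, y, lag=0):
    """Correlate x in month t with y in month t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag:
        # Drop the last `lag` months of x and the first `lag` months of y
        # so that x[t] lines up with y[t + lag].
        x, y = x[:-lag], y[lag:]
    return pearson(x, y)


# Hypothetical monthly series: search volume and admissions one month later.
search_volume = [10, 30, 20, 50, 40, 60]
admissions = [7, 11, 15, 13, 19, 17]
same_month_r = lagged_correlation(search_volume, admissions, lag=0)
next_month_r = lagged_correlation(search_volume, admissions, lag=1)
```

In this toy series the admissions follow the search volume with a one-month delay, so the lag-1 correlation is higher than the same-month correlation.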
7.Review of statistical methods for survival analysis using genomic data
Genomics & Informatics 2019;17(4):e41-
Survival analysis mainly deals with the time to an event, such as death, onset of disease, or bankruptcy. The common characteristic of survival analysis is that it contains “censored” data, in which the time to event cannot be completely observed, but instead represents the lower bound of the time to event. Only the occurrence of either time to event or censoring time is observed. Many traditional statistical methods have been effectively used for analyzing survival data with censored observations. However, with the development of high-throughput technologies for producing “omics” data, more advanced statistical methods, such as regularization, are required to construct predictive survival models with high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis, to fit nonlinear and complex interaction effects between predictors, and achieve more accurate prediction of individual survival probability. Presently, since most clinicians and medical researchers can easily access statistical programs for analyzing survival data, a review article is helpful for understanding statistical methods used in survival analysis. We review traditional survival methods and regularization methods, with various penalty functions, for the analysis of high-dimensional genomics, and describe machine learning techniques that have been adapted to survival analysis.
8.Review of statistical methods for survival analysis using genomic data
Genomics & Informatics 2019;17(4):41-
Survival analysis mainly deals with the time to an event, such as death, onset of disease, or bankruptcy. The common characteristic of survival analysis is that it contains “censored” data, in which the time to event cannot be completely observed, but instead represents the lower bound of the time to event. Only the occurrence of either time to event or censoring time is observed. Many traditional statistical methods have been effectively used for analyzing survival data with censored observations. However, with the development of high-throughput technologies for producing “omics” data, more advanced statistical methods, such as regularization, are required to construct predictive survival models with high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis, to fit nonlinear and complex interaction effects between predictors, and achieve more accurate prediction of individual survival probability. Presently, since most clinicians and medical researchers can easily access statistical programs for analyzing survival data, a review article is helpful for understanding statistical methods used in survival analysis. We review traditional survival methods and regularization methods, with various penalty functions, for the analysis of high-dimensional genomics, and describe machine learning techniques that have been adapted to survival analysis.
Bankruptcy; Genomics; Machine Learning; Methods; Survival Analysis
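To make the role of censoring concrete, the sketch below implements the Kaplan-Meier estimator, one of the traditional survival methods of the kind such a review covers. The function name and the toy data are assumptions for illustration; censored subjects contribute to the number at risk until their censoring time but never count as events.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve for right-censored data.

    times:  observed time per subject (event time or censoring time)
    events: 1 if the event was observed, 0 if the subject was censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: 5 subjects, two of them censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

The subject censored at time 3 still counts in the risk set at time 3, which is how the estimator uses the "lower bound" information that a censored time carries.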
9.Gene-Gene Interaction Analysis for the Accelerated Failure Time Model Using a Unified Model-Based Multifactor Dimensionality Reduction Method.
Seungyeoun LEE ; Donghee SON ; Wenbao YU ; Taesung PARK
Genomics & Informatics 2016;14(4):166-172
Although a large number of genetic variants have been identified to be associated with common diseases through genome-wide association studies, there still exist limitations in explaining the missing heritability. One approach to solving this missing heritability problem is to investigate gene-gene interactions, rather than a single-locus approach. For gene-gene interaction analysis, the multifactor dimensionality reduction (MDR) method has been widely applied, since the constructive induction algorithm of MDR efficiently reduces high-order dimensions into one dimension by classifying multi-level genotypes into high- and low-risk groups. The MDR method has been extended to various phenotypes and has been improved to provide a significance test for gene-gene interactions. In this paper, we propose a simple method, called accelerated failure time (AFT) UM-MDR, in which the idea of a unified model-based MDR is extended to the survival phenotype by incorporating AFT-MDR into the classification step. The proposed AFT UM-MDR method is compared with AFT-MDR through simulation studies, and a short discussion is given.
Classification; Genome-Wide Association Study; Genotype; Methods*; Multifactor Dimensionality Reduction*; Phenotype
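The dimensionality-reduction step mentioned in the abstract above, which collapses multi-locus genotype combinations into high- and low-risk groups, can be sketched for the classic case-control MDR. This is not the AFT UM-MDR method proposed in the paper, which handles survival phenotypes; the function name, the threshold rule, and the data are illustrative assumptions.

```python
from collections import Counter


def mdr_classify(genotypes_a, genotypes_b, is_case):
    """Label each two-locus genotype cell as "high" or "low" risk.

    A cell is high risk when its case/control ratio exceeds the overall
    case/control ratio in the sample (classic MDR rule, binary phenotype).
    Assumes the sample contains at least one control.
    """
    cases, controls = Counter(), Counter()
    for ga, gb, case in zip(genotypes_a, genotypes_b, is_case):
        (cases if case else controls)[(ga, gb)] += 1

    threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        if controls[cell]:
            ratio = cases[cell] / controls[cell]
        else:
            ratio = float("inf")  # only cases observed in this cell
        labels[cell] = "high" if ratio > threshold else "low"
    return labels


# Hypothetical genotypes coded 0/1/2 at two loci for 6 subjects.
snp_a = [0, 0, 1, 1, 2, 2]
snp_b = [0, 0, 1, 1, 0, 0]
case_status = [1, 1, 0, 0, 1, 0]
risk_labels = mdr_classify(snp_a, snp_b, case_status)
```

The resulting single binary variable (high versus low risk) replaces the original grid of up to nine genotype cells, which is the sense in which high-order dimensions are reduced to one.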
10.The Sex-Related Differences of EEG Coherences between Patients with Bipolar Disorder and Controls.
Hyunju YOU ; Yu Sang LEE ; Eunsoog AN ; Donghwa JEONG ; Seongkyun KIM ; Jaeseung JEONG ; Yongtae KWAK ; Seungyeoun LEE
Journal of the Korean Society of Biological Psychiatry 2015;22(4):205-215
OBJECTIVES: Sex hormone exposure during the prenatal period affects cerebral lateralization. Male brains are thought to be more lateralized than female brains. Bipolar disorder is known to show abnormalities in cerebral laterality, whose characteristics can be estimated by electroencephalography (EEG) coherences. We studied sex-related differences of EEG coherences between healthy controls and patients with bipolar disorder to examine the sex effects in the genesis of bipolar disorder. METHODS: Participants were 25 patients with bipolar disorder (11 male, 14 female) and 46 healthy controls (23 male, 23 female). EEG was recorded in the eyes-closed resting state. To examine dominant EEG coherence associated with sex differences in both groups within five frequency bands (delta, theta, alpha, beta, and gamma) across several brain regions, statistical analyses were performed using analysis of covariance. RESULTS: Although no statistically significant results were found, some notable findings emerged. Healthy control females showed greater interhemispheric coherences than control males in the gamma frequency band. There were no differences in the intrahemispheric coherences between the healthy control males and females. In patients with bipolar disorder, the female-dominant pattern in interhemispheric coherences was attenuated compared with healthy controls. CONCLUSIONS: Sex differences of EEG coherences, which could be a marker for cerebral laterality, were attenuated in patients with bipolar disorder compared with healthy controls. These results imply that abnormal sex hormone exposure during early development might play some role in the pathogenesis of bipolar disorder.
Bipolar Disorder*; Brain; Electroencephalography*; Female; Gonadal Steroid Hormones; Humans; Male; Sex Characteristics
