- Author:Sangjun LEE; Kwang-Pil KO; Jung Eun LEE; Inah KIM; Sun Ha JEE; Aesun SHIN; Sun-Seog KWEON; Min-Ho SHIN; Sangmin PARK; Seungho RYU; Sun Young YANG; Seung Ho CHOI; Jeongseon KIM; Sang-Wook YI; Daehee KANG; Keun-Young YOO; Sue K. PARK
- Publication Type:Original Article
- From:Journal of Preventive Medicine and Public Health 2022;55(5):464-474
- Country:Republic of Korea
- Language:English
- Abstract:
Objectives:We introduced the cohort studies included in the Korean Cohort Consortium (KCC), focusing on large-scale cohort studies established in Korea with a prolonged follow-up period. We also provided projections of the follow-up period and estimates of the sample size necessary for big-data analyses based on pooling established cohort studies, including population-based genomic studies.
Methods:We mainly focused on the characteristics of individual cohort studies from the KCC. We developed “PROFAN”, a Shiny application for projecting the follow-up period to achieve a certain number of cases when pooling established cohort studies. As examples, we projected the follow-up periods for 5000 cases of gastric cancer, 2500 cases of prostate and breast cancer, and 500 cases of non-Hodgkin lymphoma. The sample sizes for sequencing-based analyses based on a 1:1 case-control study were also calculated.
Results:The KCC consisted of 8 individual cohort studies, of which 3 were community-based and 5 were health screening-based cohorts. The population-based cohort studies were mainly organized by Korean government agencies and research institutes. The projected follow-up period was at least 10 years to achieve 5000 cases based on a cohort of 0.5 million participants. The mean of the minimum to maximum sample sizes for performing sequencing analyses was 5917-72 102.
Conclusions:We propose an approach to establish a large-scale consortium based on the standardization and harmonization of existing cohort studies to obtain adequate statistical power with a sufficient sample size to analyze high-risk groups or rare cancer subtypes.
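
The follow-up projection described in the Methods can be illustrated with a minimal sketch. This is not the PROFAN implementation, whose internals are not described in this abstract. It assumes a closed cohort with a constant annual incidence rate and no attrition, aging effects, or competing risks. The function name `projected_follow_up_years` and the incidence value of 100 per 100 000 person-years are hypothetical placeholders, not figures from the study.

```python
import math


def projected_follow_up_years(target_cases, cohort_size, incidence_per_100k):
    """Years of follow-up until the expected case count reaches target_cases.

    Simplifying assumptions (not taken from the article): a closed cohort,
    a constant annual incidence rate, and no loss to follow-up.
    """
    if target_cases <= 0 or cohort_size <= 0 or incidence_per_100k <= 0:
        raise ValueError("all inputs must be positive")
    fraction = target_cases / cohort_size
    if fraction >= 1:
        raise ValueError("target_cases must be smaller than cohort_size")
    rate = incidence_per_100k / 100_000  # cases per person-year
    # Under a constant hazard, expected cumulative cases after t years are
    # cohort_size * (1 - exp(-rate * t)); solve this for t.
    return -math.log(1 - fraction) / rate


# Hypothetical inputs: 5000 target cases, 0.5 million participants,
# and an assumed incidence of 100 per 100 000 person-years.
years = projected_follow_up_years(5000, 500_000, 100)
print(f"Projected follow-up: {years:.2f} years")
```

With these placeholder inputs the sketch gives roughly 10 years. A realistic projection would use age- and sex-specific incidence rates and account for participants lost to follow-up.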