Data harmonization and sharing in study cohorts of respiratory diseases.
10.3760/cma.j.issn.0254-6450.2018.02.019
- Author: Y X SUN (1); Z C PEI; S Y ZHAN
Author Information
1. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China.
- Publication Type:Journal Article
- Keywords:
Cohort study of respiratory diseases;
Common data model;
Data harmonization;
Data sharing
- MeSH:
Data Collection/standards*;
Databases, Factual/standards*;
Delivery of Health Care/organization & administration*;
Humans;
Information Dissemination;
Pulmonary Disease, Chronic Obstructive
- From:
Chinese Journal of Epidemiology
2018;39(2):233-239
- Country:China
- Language:Chinese
- Abstract:
Objective: Chronic obstructive pulmonary disease, asthma, interstitial lung disease and pulmonary thromboembolism are the most common and severe respiratory diseases and seriously jeopardize the health of Chinese citizens. Large-scale prospective cohort studies are needed to explore the relationships between potential risk factors and respiratory disease outcomes and to observe disease prognoses through long-term follow-up. We aimed to develop a common data model (CDM) for cohort studies on respiratory diseases, in order to harmonize data from multiple sources and facilitate their exchange, pooling, sharing and storage, so that the follow-up data collected in the cohorts can be reused and standardized. Methods: The CDM for respiratory diseases was developed in three steps: ① reviewing international standards, including the Clinical Data Interchange Standards Consortium (CDISC), Clinical Data Acquisition Standards Harmonization (CDASH) and the Observational Medical Outcomes Partnership (OMOP) CDM; ② summarizing the four respiratory disease cohort studies included in this research and assessing their data availability; ③ developing a CDM for respiratory diseases. Results: The data from the included cohorts shared a few similar domains but used differing schemas. The cohorts also shared similar data collection purposes for future follow-up studies, making the harmonization of current and future data feasible. The derived CDM comprises two parts: ① thirteen common domains for all four cohorts, with variables derived from disparate questions under a common schema; ② additional domains designed for disease-specific research needs, as well as additional variables that are disease-specific but were not initially included in the common domains. Conclusion: Data harmonization is essential for sharing, comparing and pooling data for analysis, both retrospectively and prospectively. A CDM is needed to convert heterogeneous data from multiple studies into one harmonized dataset. Using a CDM in multicenter respiratory cohort studies would make the continuous collection of uniform data possible, thereby supporting data exchange and sharing in the future.
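The conversion of heterogeneous cohort data into one harmonized dataset can be illustrated with a minimal Python sketch. It is not the CDM described in the abstract: the cohort names, source field names, target variables (`subject_id`, `sex`, `smoking_status`, `fev1_l`) and coding schemes below are all hypothetical, chosen only to show how disparate questions might be mapped onto a common schema.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical mapping type: common variable -> (source field, converter).
FieldMap = Dict[str, Tuple[str, Callable[[Any], Any]]]


def _smoking_from_text(value: str) -> str:
    # Assumed free-text coding used by one cohort.
    return {"yes": "current", "no": "never", "quit": "former"}[value.lower()]


def _smoking_from_code(value: int) -> str:
    # Assumed numeric coding used by another cohort.
    return {0: "never", 1: "former", 2: "current"}[value]


# Illustrative per-cohort mappings; every name here is an assumption.
COHORT_MAPPINGS: Dict[str, FieldMap] = {
    "copd_cohort": {
        "subject_id": ("pid", str),
        "sex": ("gender", lambda v: {"M": "male", "F": "female"}[v]),
        "smoking_status": ("smoke", _smoking_from_text),
        "fev1_l": ("fev1_ml", lambda v: v / 1000.0),  # millilitres to litres
    },
    "asthma_cohort": {
        "subject_id": ("id", str),
        "sex": ("sex", lambda v: {1: "male", 2: "female"}[v]),
        "smoking_status": ("smoking_code", _smoking_from_code),
        "fev1_l": ("fev1", float),  # already recorded in litres
    },
}


def harmonize(cohort: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one source record into the common schema."""
    out: Dict[str, Any] = {"cohort": cohort}
    for target, (source, convert) in COHORT_MAPPINGS[cohort].items():
        raw = record.get(source)
        # Missing source values stay missing rather than being guessed.
        out[target] = None if raw is None else convert(raw)
    return out


def pool(sources: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Harmonize every record from every cohort into one pooled list."""
    return [harmonize(c, r) for c, rows in sources.items() for r in rows]
```

Calling `pool` on records from both hypothetical cohorts yields rows with identical keys, which is the property that makes pooled analysis possible. Disease-specific variables outside the common domains would need separate, cohort-specific tables in such a design.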