Application and design of lymphoma dataset based on real-world research
10.3760/cma.j.cn113565-113565-20220512-00082
- VernacularTitle:基于真实世界研究的淋巴瘤研究数据库建设与应用
- Author:Lan MI 1; Meng WU; Feier FENG; Tingting DU; Luersulitan REYIZHA; Mengmeng LIN; Mingfang NIU; Yuqin SONG; Yan XIE; Jun ZHU
Author Information
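1. Department of Lymphoma, Peking University Cancer Hospital and Beijing Institute for Cancer Research, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Beijing 100142, China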
- Keywords:Real-world research; Electronic medical records; Lymphoma; Database; Natural language processing; Artificial intelligence
- From:Chinese Journal of Medical Science Research Management 2023;36(1):18-23
- Country:China
- Language:Chinese
- Abstract:
Objective: Clinical data are abundant but often of poor quality. Taking the construction of a lymphoma research database as an example, this study explores how to establish a high-quality research database and what role it can play in real-world research.
Methods: Expert opinions in the field of lymphoma were collected, and relevant guidelines and standards were consulted, to establish a standard medical knowledge dataset. Electronic diagnosis and treatment data of lymphoma patients treated at Peking University Cancer Hospital from February 2005 to December 2020 were retrospectively extracted. Deep learning and natural language processing were used to build a dynamic, intelligent information integration and processing system linking three components: a lymphoma database based on the electronic medical record system, a biological sample information database, and an extended genetic information database.
Results: The research database meets the needs of clinical researchers and also keeps an audit trail across the whole process of application, approval, tracing, and analysis of hospital medical record data and biological sample data. The database contained 668 research variables, of which 46.0% were structured. As of December 25, 2021, it held 68 687 lymphoma patients, with a male-to-female ratio of 8/9; 23.0% of patients had ≥3 visits. In addition, researchers can layer successive searches in the database according to target conditions, display the medical records that match a research hypothesis, and then establish a research cohort, carry out statistical modeling, and mine the data.
Conclusions: Integrating management processes and applying new natural language processing and artificial intelligence technology to establish a high-level evidence-based database helps interconnect hospital information systems and share their resources, thereby providing reliable and detailed data for real-world research.
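The layered search described in the Results can be pictured as applying one filter after another to the patient records until a research cohort remains. The following is only a minimal sketch of that idea, not the system described in the abstract: the `PatientRecord` fields, the `build_cohort` helper, the subtype labels, and the four toy records are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical fields; the real database defines 668 research variables.
    patient_id: str
    sex: str
    subtype: str
    visit_count: int


# A search condition is any function that accepts or rejects one record.
Condition = Callable[[PatientRecord], bool]


def build_cohort(records: Iterable[PatientRecord],
                 conditions: Iterable[Condition]) -> List[PatientRecord]:
    """Apply each condition in turn, narrowing the previous result."""
    cohort = list(records)
    for condition in conditions:
        cohort = [r for r in cohort if condition(r)]
    return cohort


# Made-up toy records, not data from the study.
records = [
    PatientRecord("P001", "F", "subtype_A", 5),
    PatientRecord("P002", "M", "subtype_A", 1),
    PatientRecord("P003", "F", "subtype_B", 4),
    PatientRecord("P004", "M", "subtype_A", 3),
]

# Two stacked conditions: a given subtype, then at least three visits.
conditions = [
    lambda r: r.subtype == "subtype_A",
    lambda r: r.visit_count >= 3,
]

cohort = build_cohort(records, conditions)
print([r.patient_id for r in cohort])
```

Keeping each condition as a separate function means a further criterion can be appended to the list without rewriting the earlier ones, which mirrors the idea of superimposing searches.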