Named entity recognition of eligibility criteria for clinical trials based on BioBERT and BiLSTM
10.3969/j.issn.1005-202X.2024.01.018
- VernacularTitle:基于BioBERT与BiLSTM的临床试验纳排标准命名实体识别
- Author: Shengqing LI 1; Qianmin SU; Jihan HUANG
Author Information
1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Keywords: eligibility criteria; named entity recognition; bidirectional long short-term memory network; conditional random field; clinical trial
- From: Chinese Journal of Medical Physics 2024;41(1):125-132
- Country: China
- Language: Chinese
- Abstract:
Objective: To present BioBERT-Att-BiLSTM-CRF, a named entity recognition method for eligibility criteria built on the BioBERT pretrained model. The method automatically extracts relevant information from clinical trials and assists in formulating eligibility criteria efficiently. Methods: Medical entity annotation rules were established from the UMLS medical semantic network and expert-defined rules, and a named entity recognition corpus was constructed to define the entity recognition task. BioBERT-Att-BiLSTM-CRF converted the text into BioBERT vectors and fed them into a bidirectional long short-term memory network to capture contextual semantic features. An attention mechanism was then applied to extract key features, and a conditional random field was used to decode and output the optimal label sequence. Results: BioBERT-Att-BiLSTM-CRF outperformed the baseline models on the eligibility criteria named entity recognition dataset. Conclusion: BioBERT-Att-BiLSTM-CRF can efficiently extract eligibility criteria-related information from clinical trials, thereby improving the scientific validity of clinical trial registration data and assisting in the formulation of eligibility criteria for clinical trials.
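The final step described in Methods, decoding the optimal label sequence with a conditional random field, is commonly done with the Viterbi algorithm. The sketch below is a minimal illustration of that decoding step only, not the authors' implementation. The label set (`O`, `B-Disease`, `I-Disease`), the example tokens, and all emission and transition scores are hypothetical values chosen for illustration; in the described model the emission scores would come from the BioBERT, BiLSTM and attention layers, and the transition scores would be learned by the CRF.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF."""
    if not emissions:
        return []
    # best[t][y]: best score of any label path that ends in label y at position t
    best = [{y: emissions[0][y] for y in labels}]
    # back[t - 1][y]: best previous label when position t carries label y
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label to recover the path
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical label set and scores (assumptions, not taken from the paper)
labels = ["O", "B-Disease", "I-Disease"]
tokens = ["history", "of", "heart", "failure"]

# Per-token emission scores, standing in for the network output
emissions = [
    {"O": 2.0, "B-Disease": 0.1, "I-Disease": 0.0},
    {"O": 2.0, "B-Disease": 0.0, "I-Disease": 0.1},
    {"O": 1.0, "B-Disease": 1.2, "I-Disease": 1.1},
    {"O": 0.5, "B-Disease": 0.3, "I-Disease": 2.0},
]

# transitions[a][b]: score of moving from label a to label b.
# O -> I-Disease is heavily penalised because an entity must start with B-.
transitions = {
    "O": {"O": 0.5, "B-Disease": 0.0, "I-Disease": -10.0},
    "B-Disease": {"O": 0.0, "B-Disease": -1.0, "I-Disease": 1.0},
    "I-Disease": {"O": 0.0, "B-Disease": -1.0, "I-Disease": 0.5},
}

decoded = viterbi_decode(emissions, transitions, labels)
for token, label in zip(tokens, decoded):
    print(f"{token}\t{label}")
```

With these scores, "heart" on its own slightly favours `B-Disease` over `O` only by emission, while the sequence-level transition scores are what commit the decoder to a well-formed `B-Disease`, `I-Disease` span. That sequence-level constraint is the usual reason for placing a CRF on top of per-token scores.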