Entity Recognition in Famous Medical Records Based on BRL Neural Network Model
10.13422/j.cnki.syfjx.20241165
- VernacularTitle:基于BRL神经网络模型的名家医案实体识别
- Author:
Hang YANG 1; Yehui PENG 1; Wei YANG 2; Jiaheng WANG 2; Zhiwei ZHAO 2; Wenyuan XU 2; Yuxin LI 2; Yan ZHU 3; Lihong LIU 3
- Author Information:
1. School of Mathematics and Computational Science, Hunan University of Science and Technology, Xiangtan 411201, China
2. Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
3. Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Publication Type:Journal Article
- Keywords:
named entity recognition;
pre-trained model;
radical embedding;
associative word embedding;
famous medical records
- From:
Chinese Journal of Experimental Traditional Medical Formulae
2024;30(24):167-173
- Country:China
- Language:Chinese
- Abstract:
Objective: To improve the accuracy of named entity recognition in medical record texts and enable effective mining and use of medical record knowledge, a Bert-Radical-Lexicon (BRL) neural network model was constructed to recognize medical record entities according to the characteristics of medical record texts. Method: We selected 408 medical records related to hypertension from the Complete Library of Famous Medical Records of Chinese Dynasties and constructed a dataset of 1 672 medical records by manual labeling. We then randomly divided the dataset into three subsets: the training set (1 004 cases), the testing set (334 cases), and the validation set (334 cases). Based on this dataset, we built the BRL model, which fuses multiple text features of medical records, its variants BRL-B, BRL-L, and BRL-R, and a baseline model (Base) for the experiments. During the training phase, we trained these models on the training set. To reduce the risk of overfitting, we continuously monitored each model's performance on the validation set during training and saved the best-performing model. Finally, we evaluated the models on the testing set. Result: The BRL model performed best among the compared models on the medical record named entity recognition task, with an overall precision of 90.09%, a recall of 90.61%, and an F1 score (the harmonic mean of precision and recall) of 90.35% across eight entity types: disease, symptom, tongue manifestation, pulse condition, syndrome, method of treatment, prescription, and traditional Chinese medicine (TCM). Compared with the Base model, the BRL model improved the overall F1 score by 5.22%, and the F1 score for pulse condition entities rose by 6.92%, the largest increase among the entity types.
Conclusion: By incorporating multiple medical record text features in the embedding layer, the BRL neural network model achieves stronger named entity recognition and thus extracts more accurate and reliable TCM clinical information.
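The data split and the F1 definition given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_indices` and `f1_score`, the fixed `seed`, and the use of integer indices in place of real records are all assumptions made for illustration. Only the subset sizes (1 004 / 334 / 334) and the reported precision and recall come from the abstract.

```python
import random


def split_indices(n_total, n_train, n_test, seed=0):
    """Shuffle record indices and cut them into train / test / validation parts.

    Hypothetical helper: the abstract only states that the split was random.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumption)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    valid = indices[n_train + n_test:]  # whatever remains becomes the validation set
    return train, test, valid


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as defined in the abstract."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Subset sizes reported in the abstract: 1 004 + 334 + 334 = 1 672 records.
train, test, valid = split_indices(1672, 1004, 334)
print(len(train), len(test), len(valid))  # 1004 334 334

# Overall precision and recall reported for the BRL model, in percent.
overall_f1 = f1_score(90.09, 90.61)
print(round(overall_f1, 2))  # 90.35
```

The computed harmonic mean agrees with the overall F1 of 90.35% reported in the abstract. How entity matches were counted to obtain the precision and recall themselves is not described in the abstract, so that step is left out of the sketch.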