A Method for Extracting Data Elements from Chinese Electronic Medical Records
10.3969/j.issn.1673-6036.2024.08.013
- VernacularTitle:中文电子病历数据元抽取方法
- Author:
Weijia GUO 1; Shaoyou GUO
- Author Information:
1. Henan Provincial Library, Zhengzhou 450052
- Keywords:
electronic medical records (EMR);
data element;
ALBERT;
sequence labeling;
token
- From:
Journal of Medical Informatics
2024;45(8):78-83
- Country:China
- Language:Chinese
- Abstract:
Purpose/Significance: A method is proposed for extracting data elements from electronic medical records (EMR) based on national standards, helping to achieve fine-grained sharing of EMR data. Method/Process: ALBERT, BiLSTM, and CRF models are used to perform sequence labeling on EMR, and a set of candidate data elements is generated from the labeling results. For each candidate data element, contextual information is collected to form an enhanced key vector. The similarity between this vector and the standard vector is then calculated to determine whether the candidate data element is valid. Result/Conclusion: The F1 value is 90.32%, indicating that the proposed method performs well.
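The validation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or decision rule is used, so the cosine similarity, the fixed threshold of 0.8, and the names `cosine_similarity` and `is_valid_candidate` below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector carries no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def is_valid_candidate(enhanced_vec, standard_vec, threshold=0.8):
    """Accept a candidate data element when its enhanced key vector is
    close enough to the standard vector. The threshold is hypothetical."""
    return cosine_similarity(enhanced_vec, standard_vec) >= threshold
```

In practice the vectors would come from the model's representation of the candidate and its context, and the threshold would be tuned on labeled data; neither detail is given in the abstract.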