Automatic labeling and extraction of terms in natural language processing in acupuncture clinical literature.
10.13703/j.0255-2930.20211107-k0002
- Author:
Hua-Yun LIU 1;
Chen-Jing HAN 1;
Jie XIONG 2;
Hai-Yan LI 2;
Lei LEI 2;
Bao-Yan LIU 3
- Author Information:
1. Graduate School of Tianjin University of TCM, Tianjin 301617, China.
2. Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences.
3. China Academy of Chinese Medical Sciences, Beijing 100700.
- Publication Type:Journal Article
- Keywords:
Bi LSTM-CRF;
acupuncture clinical literature;
named entity recognition;
term recognition
- MeSH:
Acupuncture Therapy;
Electronic Health Records;
Natural Language Processing
- From:
Chinese Acupuncture & Moxibustion
2022;42(3):327-331
- Country:China
- Language:Chinese
- Abstract:
This paper analyzes the specific characteristics of term recognition in acupuncture clinical literature and compares the advantages and disadvantages of three named entity recognition (NER) methods used in the field of traditional Chinese medicine. The authors conclude that the bi-directional long short-term memory network with a conditional random field layer (Bi LSTM-CRF) can capture contextual information and perform NER with fewer feature rules, which makes the model suitable for term recognition in acupuncture clinical literature. Based on this model, they propose that term recognition in acupuncture clinical literature should proceed in four steps: literature preprocessing, sequence labeling, model training and performance evaluation. This process offers an approach to structuring the terminology in acupuncture clinical literature.
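
The sequence labeling and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the character-level BIO tag scheme, the entity types `ACUPOINT` and `DISEASE`, the sample sentence, the helper names, and the choice of exact-match entity-level precision, recall and F1 as the evaluation measure. The Bi LSTM-CRF training step itself is not shown.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); the entity types are hypothetical.
Span = Tuple[int, int, str]


def bio_tags(text: str, spans: List[Span]) -> List[str]:
    """Turn annotated spans into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def extract_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into entity spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_prf(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall and F1."""
    tp = len(set(gold) & set(pred))
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up sample sentence: "Needling Zusanli to treat stomach pain".
text = "针刺足三里治疗胃痛"
gold = [(2, 5, "ACUPOINT"), (7, 9, "DISEASE")]
gold_tags = bio_tags(text, gold)

# Pretend a trained model found the acupoint but missed the disease term.
pred_tags = gold_tags[:7] + ["O", "O"]
precision, recall, f1 = entity_prf(gold, extract_spans(pred_tags))
```

In a full pipeline, `gold_tags` would serve as training targets for the tagger, and `entity_prf` would be applied to the model's decoded output on held-out sentences.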