Relation Extraction of Traditional Chinese Medicine Prescription and Disease Based on Literature Abstracts Data
- VernacularTitle:面向文摘的中药方剂与疾病关系抽取
- Author: Xiaohuan YANG; Yahui SHAN; Dan XIE; Xiaodong LI
- Keywords: Relation extraction of traditional Chinese medicine prescription and disease; data extraction; traditional Chinese medicine data extraction; web crawler technology
- From: World Science and Technology-Modernization of Traditional Chinese Medicine 2017;19(7):1167-1172
- Country: China
- Language: Chinese
- Abstract: This paper studied the relation between traditional Chinese medicine (TCM) prescriptions and diseases using machine learning. TCM literature abstracts were collected from the TCM category of the China National Knowledge Infrastructure (CNKI) database with web crawler technology. After data cleaning, lexicon building, word segmentation, and other basic preprocessing, natural language processing techniques were used to extract features from the web text data, a Support Vector Machine (SVM) classification model was constructed, and relations between TCM prescriptions and diseases were extracted. Among 1,073,581 abstracts, 204,780 sentences that contained both a TCM prescription and a disease according to the dictionaries were retained. The SVM classification model whose features were constructed with a constituency parser achieved the better accuracy, reaching 87%. Applying this SVM model to the filtered sentences yielded relation triples between TCM prescriptions and diseases. It was concluded that relation triples extracted from CNKI abstract data by machine learning can support research on the treatment of diseases with TCM prescriptions.
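
The dictionary-based filtering step described in the abstract (keeping only sentences that mention both a prescription and a disease) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two toy lexicons, the sentence-splitting rule, and the function names `split_sentences` and `candidate_pairs` are assumptions made for illustration, and the SVM classification that follows this step in the paper is not shown.

```python
import re
from itertools import product

# Hypothetical toy lexicons; the dictionaries used in the paper are not
# given in the abstract.
PRESCRIPTIONS = {"桂枝汤", "小柴胡汤"}
DISEASES = {"感冒", "高血压"}


def split_sentences(abstract):
    """Split text on common Chinese and Western sentence-ending punctuation."""
    parts = re.split(r"[。！？!?；;]", abstract)
    return [p.strip() for p in parts if p.strip()]


def candidate_pairs(abstract):
    """Return (prescription, disease, sentence) for sentences mentioning both.

    Matching is plain substring lookup against the lexicons (an assumption);
    each returned tuple is only a candidate that a classifier would still
    have to accept or reject.
    """
    out = []
    for sent in split_sentences(abstract):
        found_rx = sorted(t for t in PRESCRIPTIONS if t in sent)
        found_dz = sorted(t for t in DISEASES if t in sent)
        # Sentences with only one kind of entity produce no pairs.
        for rx, dz in product(found_rx, found_dz):
            out.append((rx, dz, sent))
    return out
```

Each returned tuple pairs one prescription with one disease from the same sentence, which is the form of input a relation classifier such as the SVM mentioned above would then label.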