Correlation between echocardiography report narratives and the risk level of congenital heart disease in children
10.3969/j.issn.1672-8467.2018.02.002
- VernacularTitle:儿童先天性心脏病超声心动图报告与个体风险的相关性分析
- Author:Ya-Hui SHI¹; Zuo-Feng LI; Cai CHANG; Xiao-Yan ZHANG
Author Information
1. School of Life Sciences and Technology, Tongji University, Shanghai 200092
- Keywords:echocardiography; congenital heart disease; natural language processing; machine learning; children
- From:Fudan University Journal of Medical Sciences 2018;45(2):151-157
- Country:China
- Language:Chinese
- Abstract:
Objective: To analyze the correlation between echocardiography report narratives and the risk level of congenital heart disease in children, and to validate the feasibility and value of applying text mining techniques to this task. Methods: Echocardiography reports of 1,042 children with congenital heart disease were retrospectively analyzed. Natural language processing (NLP) techniques were used to generate features from the clinical narratives for machine learning algorithms. Decision trees were trained to predict the risk level of each patient. Model performance was evaluated by classification accuracy and normalized mean absolute error (NMAE), both averaged over 50 rounds of stratified 10-fold cross-validation. By analyzing the branches of the decision tree, we formulated the possible decision path of a clinician and identified the key information in the clinical narratives. Results: Compared with the auto-generated 3-grams, the selected features yielded better performance. After feature selection, the prediction accuracy improved from 32.82% to 48.57%, while the NMAE decreased from 0.33 to 0.25. Conclusions: Based on echocardiography report narratives, the risk levels of congenital heart disease in children can be evaluated by our model with an accuracy level of 75%. Echocardiographic terms that describe the lesion provide significant information to support clinical decision making.
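The feature generation and the two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define NMAE or state the number of risk levels, so the sketch assumes that NMAE is the mean absolute error divided by the largest possible error on an ordinal scale, and it uses a hypothetical five-level scale. The function names, the whitespace tokenizer, and the sample sentence are likewise illustrative assumptions.

```python
from collections import Counter


def word_ngrams(text, n=3):
    """Count contiguous word n-grams in a report narrative (whitespace tokens)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted risk level matches the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def nmae(y_true, y_pred, n_levels):
    """Mean absolute error divided by the largest possible error (n_levels - 1).

    Assumed definition: the abstract names NMAE but does not spell it out.
    """
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / (n_levels - 1)


# Hypothetical report fragment and hypothetical risk levels on a 1..5 scale.
features = word_ngrams("large ventricular septal defect with left to right shunt")
y_true = [1, 2, 3, 4, 5]
y_pred = [1, 3, 3, 2, 5]

print(len(features))                # number of distinct 3-grams
print(accuracy(y_true, y_pred))     # exact-match rate
print(nmae(y_true, y_pred, 5))      # normalized ordinal error
```

Unlike accuracy, which counts only exact matches, a normalized error of this kind gives partial credit to predictions that fall close to the true level on the ordinal scale.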