Machine learning models for analyzing valvular heart disease combined with atrial fibrillation using electronic health records
- VernacularTitle:利用电子健康记录分析心脏瓣膜疾病合并心房颤动的机器学习模型
- Author:Nuoyangfan LEI (1); Qi TONG (2); Yiwen ZHANG (1); Zhengjie WANG (2); Tao LI (2); Fan PAN (3); Yongjun QIAN (2)
Author Information
1. School of Computer Science (School of Software), Sichuan University, Chengdu, 610065, P. R. China
2. Department of Cardiovascular Surgery, West China Hospital, Sichuan University, Chengdu, 610041, P. R. China
3. School of Electronic Information, Sichuan University, Chengdu, 610065, P. R. China
- Publication Type:Journal Article
- Keywords:Atrial fibrillation; valvular heart disease; machine learning; risk prediction; interpretable analysis
- From:Chinese Journal of Clinical Thoracic and Cardiovascular Surgery 2022;29(08):953-962
- Country:China
- Language:Chinese
- Abstract:
Objective: To establish a machine learning framework that rapidly screens patients with valvular heart disease for those at high risk of developing atrial fibrillation (AF), and that gives clinicians the associated risk-prediction information to support timely treatment decisions.

Methods: Clinical data were retrospectively collected from 1,740 patients with valvular heart disease at West China Hospital of Sichuan University and its branches: 831 (47.76%) males and 909 (52.24%) females, with an average age of 54 years. From these data we built a classical logistic regression model, three standard machine learning models, and three ensemble machine learning models for AF risk prediction and feature analysis. We compared the machine learning models against logistic regression, selected the two best-performing models, and applied the SHAP algorithm to provide interpretability at both the population level and the individual-patient level. We also visualized the feature analysis results.

Results: The Stack model performed best among all models (AF detection rate 85.6%, F1 score 0.753), and XGBoost outperformed the standard machine learning models (AF detection rate 71.9%, F1 score 0.732). Both performed significantly better than the logistic regression model (AF detection rate 65.2%, F1 score 0.689). SHAP analysis identified left atrial internal diameter, mitral E peak flow velocity (Emv), right atrial internal diameter, output per beat, and cardiac function class as the most important features affecting AF prediction. Both the Stack model and XGBoost showed excellent predictive ability and interpretability.

Conclusion: The Stack model achieved the highest AF detection rate and the best overall performance. Combined with the SHAP algorithm, it can be used to screen patients at high risk of AF and to reveal the corresponding risk features. The framework can be used to guide clinical intervention and monitoring of AF.
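
As a reading aid for the two metrics quoted in the abstract, the sketch below computes a detection rate and an F1 score from binary predictions. It assumes that "AF detection rate" means recall (sensitivity) for the AF class, which the abstract does not state explicitly. The function names and the toy labels are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = AF, 0 = no AF)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def detection_rate(tp, fn):
    """Recall for the AF class: share of true AF cases that were flagged.

    Assumption: this is what the abstract calls the "AF detection rate".
    """
    total = tp + fn
    return tp / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data for illustration only (not from the study): 4 AF cases, 6 non-AF cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("detection rate:", detection_rate(tp, fn))
print("F1 score:", f1_score(tp, fp, fn))
```

Under this reading, a model can raise its detection rate by flagging more patients, at the cost of more false positives. The F1 score penalizes that trade-off, which may be why the abstract reports both numbers for each model.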