Risk Identification Model of Coronary Artery Stenosis Constructed Based on Random Forest
10.13471/j.cnki.j.sun.yat-sen.univ(med.sci).2025.0116
- VernacularTitle:基于随机森林的冠状动脉狭窄风险识别模型
- Author:Yongfeng LV 1; Yujing WANG 2; Leyi ZHANG 2; Yixin LI 2; Na YUAN 3; Jing TIAN 4
Author Information
1. The First Clinical Medical College of Shanxi Medical University, Taiyuan 030000, China
2. Academy of Medical Sciences, Shanxi Medical University, Taiyuan 030000, China
3. Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan 030000, China
4. Department of Cardiovascular Medicine, The First Hospital of Shanxi Medical University, Taiyuan 030000, China
- Publication Type:Journal Article
- Keywords:
Gensini score;
back propagation neural network;
random forests;
coronary artery stenosis;
machine learning
- From:
Journal of Sun Yat-sen University(Medical Sciences)
2025;46(1):138-146
- Country:China
- Language:Chinese
- Abstract:
Objective: To establish a risk identification model for coronary artery stenosis using machine learning methods and to identify the key causative factors.
Methods: Patients aged ≥18 years who were diagnosed with coronary heart disease by coronary angiography between January 2013 and May 2020 at two prominent hospitals in Shanxi Province were consecutively enrolled. Logistic regression, back propagation neural network (BPNN), and random forest (RF) algorithms were used to construct models for detecting the causative factors of coronary artery stenosis. Sensitivity (TPR), specificity (TNR), accuracy (ACC), positive predictive value (PV+), negative predictive value (PV-), area under the receiver operating characteristic curve (AUC), and calibration curves were used to compare the discrimination and calibration of the models. The best model was then used to identify the main risk variables associated with coronary stenosis.
Results: The RF model showed better overall performance than the logistic regression and BPNN models. The TPR values for the logistic regression, BPNN, and RF models were 75.76%, 74.30%, and 93.70%, and the ACC values were 74.05%, 72.30%, and 79.49%, respectively. The AUC values were 0.7399 for logistic regression, 0.7231 for BPNN, and 0.7522 for RF. According to the RF model, the top ten important variables affecting coronary stenosis were chest pain, abnormal ST segments on ECG, ventricular premature beats with hypertension, atrial fibrillation, regional wall motion abnormalities (RWMA) on color echocardiography, aortic regurgitation (AR), pulmonary insufficiency (PI), family history of cardiovascular disease, and body mass index (BMI).
Conclusions: The random forest model shows the best overall performance in identifying and accurately assessing coronary artery stenosis. Predicting the risk factors that affect coronary artery stenosis can provide a scientific basis for clinical intervention and help formulate further diagnosis and treatment strategies to delay disease progression.
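Note: The five threshold-based metrics named in the Methods (TPR, TNR, ACC, PV+, PV-) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the study, and the sketch assumes every denominator is nonzero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    tp/fn: stenosis cases predicted positive/negative;
    tn/fp: non-stenosis cases predicted negative/positive.
    Assumes every denominator is nonzero.
    """
    return {
        "TPR": tp / (tp + fn),                    # sensitivity
        "TNR": tn / (tn + fp),                    # specificity
        "ACC": (tp + tn) / (tp + fp + tn + fn),   # accuracy
        "PV+": tp / (tp + fp),                    # positive predictive value
        "PV-": tn / (tn + fn),                    # negative predictive value
    }


# Hypothetical counts for illustration only (not data from the study).
example_counts = {"tp": 80, "fp": 30, "tn": 70, "fn": 20}

for name, value in diagnostic_metrics(**example_counts).items():
    print(f"{name}: {value:.2%}")
```

A model can combine a high TPR with a more modest ACC, as reported for RF here, when it labels many cases positive: few true cases are missed, but false positives lower TNR and PV+.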