Establishment of a risk prediction model for patients with type 2 diabetes and coronary heart disease based on machine learning of laboratory data
- DOI:10.3969/j.issn.1673-4130.2025.02.002
- VernacularTitle:基于检验数据的机器学习建立2型糖尿病患者合并冠心病的风险预测模型
- Author: Zhichao GU (1); Yunzhe WU; Fan YANG; Yide LU
- Author Information: 1. Department of Clinical Laboratory, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025
- Keywords: machine learning; type 2 diabetes; coronary heart disease; predictive model
- From: International Journal of Laboratory Medicine 2025;46(2):135-140
- Country: China
- Language:Chinese
- Abstract:
Objective: To analyze the characteristics of clinical indicators in patients with type 2 diabetes, and to establish a simple and effective risk prediction model for type 2 diabetes complicated with coronary heart disease by screening risk prediction indicators with machine learning.
Methods: In this retrospective study, 217 patients diagnosed with coronary artery disease combined with type 2 diabetes mellitus (T2DM) who were hospitalized in the hospital from January 2022 to November 2023 were selected. Additionally, 214 patients diagnosed with T2DM in the outpatient department during the same period were selected as the control group. Their routine laboratory test data were recorded. The Least Absolute Shrinkage and Selection Operator (Lasso) algorithm was used to select features, and models were built using seven machine learning algorithms: Random Forest, Decision Tree, Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), Logistic Regression, K-Nearest Neighbor, and Artificial Neural Network. The diagnostic efficacy of the models was evaluated using the receiver operating characteristic (ROC) curve, area under the curve (AUC), calibration curve, specificity, sensitivity, F1 value, and other indicators.
Results: Twenty key factors, including age, gender, systolic blood pressure, diastolic blood pressure, heart rate, C-reactive protein, and blood glucose, were selected using Lasso regression. When these factors were incorporated into the various models, the SVM model exhibited the highest sensitivity (88.37%), negative predictive value (82.14%), and AUC (0.845). The Random Forest model had the highest accuracy (76.47%), positive predictive value (76.74%), and F1 score (0.77). Meanwhile, the XGBoost algorithm demonstrated relatively good specificity (80.95%). SHAP (SHapley Additive exPlanations) analysis indicated that blood glucose had a significant positive impact on the occurrence of coronary heart disease in individuals with type 2 diabetes.
Conclusion: Machine learning can serve as an effective tool for assessing the risk of coronary heart disease in patients with type 2 diabetes. In this study, the SVM, Random Forest, and XGBoost models all demonstrated good predictive performance, indicating promising clinical application prospects.
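
The evaluation indicators named in the abstract (sensitivity, specificity, positive and negative predictive value, accuracy, F1 score, and AUC) all follow from a model's predictions on labeled cases. The sketch below shows one way they could be computed in plain Python. It is an illustration only, not the authors' code: the function names, the 0.5 decision threshold, and the eight toy cases are assumptions introduced here and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # cases with CHD predicted as CHD
    fp: int  # cases without CHD predicted as CHD
    tn: int  # cases without CHD predicted as no CHD
    fn: int  # cases with CHD predicted as no CHD


def confusion_counts(y_true, y_pred):
    """Tally the 2x2 confusion matrix for binary labels (1 = CHD, 0 = no CHD)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c):
    """Derive the threshold-dependent metrics from the confusion counts."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration (not from the study).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    for name, value in binary_metrics(confusion_counts(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
    print(f"auc: {roc_auc(y_true, scores):.4f}")
```

Sensitivity, specificity, predictive values, accuracy, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is why one model can lead on AUC while another leads on accuracy, as reported in the Results.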