A machine learning-based model for predicting the risk of diabetic kidney disease in type 2 diabetes mellitus
- DOI:10.3969/j.issn.1006-6187.2025.04.001
- VernacularTitle:基于机器学习比较2型糖尿病患者发生糖尿病肾脏疾病风险预测模型的研究
- Author:Tingting LI (1); Peng SU; Jinbo CHEN; Xiaoyan HE; Yi CAO; Xin ZHANG; Qingling TANG; Xubin MIAO; Xiaohua LIANG; Dong MA
Author Information
1. School of Public Health, North China University of Science and Technology, Tangshan 063210
- Publication Type:Journal Article
- Keywords:Diabetes mellitus, type 2; Diabetic kidney disease; Complication; Machine learning; Prediction model
- From:Chinese Journal of Diabetes 2025;33(4):241-247
- Country:China
- Language:Chinese
- Abstract:
Objective To compare candidate models and identify the optimal one for predicting the risk of diabetic kidney disease (DKD) in patients with type 2 diabetes mellitus (T2DM).
Methods A total of 2005 patients with T2DM were enrolled from The Second Hospital of Shijiazhuang City between December 2017 and December 2022. The subjects were divided by simple random sampling into a training set (n=1403) and a validation set (n=602) at a ratio of 3:1. With the occurrence of DKD as the outcome variable in the training set, important feature variables were screened by LASSO regression. Six machine learning models were built on the selected feature variables and compared to determine the optimal model, and an online risk predictor for DKD in patients with T2DM was constructed.
Results In the training set, with the occurrence of DKD as the outcome variable, LASSO regression with 10-fold cross-validation gave an optimal penalty value of lambda.1se=0.01662473 and retained 15 feature variables with nonzero coefficients as related to the occurrence of DKD: sex, age, family history of DM, DM duration, LDL-C, HbA1c, WBC, PDW, Scr, urine α1-microglobulin, urine β2-microglobulin, urine microalbumin, hypertension, hypokalemia, and DR. The XGBoost model outperformed the other models in both the training set and the validation set: AUC 0.872 (95%CI 0.853~0.891) and 0.893 (95%CI 0.865~0.921), sensitivity 0.779 and 0.863, specificity 0.721 and 0.758, and F1 score 0.774 and 0.787, respectively. Decision curve analysis (DCA) showed that the XGBoost model had a greater net benefit and threshold probability. An online predictor of DKD risk in T2DM patients was built on the XGBoost model and applied to two selected patients: the predicted value was 0.185 for the patient without DKD and 0.510 for the patient with DKD.
Conclusions The XGBoost model is the best of the compared models for predicting the occurrence of DKD in T2DM patients, and an online predictor was successfully built.
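Two quantities reported in the abstract can be illustrated with a minimal Python sketch: the lambda.1se selection rule (the largest penalty whose mean cross-validation error lies within one standard error of the minimum) and the sensitivity, specificity, and F1 score derived from a confusion matrix. This is not the authors' code. The function names `lambda_1se` and `binary_metrics` and all numbers in the example are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def lambda_1se(lambdas: Sequence[float],
               cv_mean: Sequence[float],
               cv_se: Sequence[float]) -> float:
    """Largest lambda whose mean CV error is within one SE of the minimum."""
    # Index of the penalty with the lowest mean cross-validation error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # One-standard-error threshold, using the SE at the minimum.
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Among all penalties meeting the threshold, take the strongest one.
    return max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 score from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


if __name__ == "__main__":
    # Hypothetical cross-validation path (not taken from the study).
    lams = [0.10, 0.05, 0.02, 0.01]
    errs = [0.60, 0.52, 0.50, 0.51]
    ses = [0.03, 0.03, 0.03, 0.03]
    print(lambda_1se(lams, errs, ses))

    # Hypothetical confusion-matrix counts (not taken from the study).
    print(binary_metrics(tp=80, fp=20, tn=70, fn=20))
```

Choosing lambda.1se rather than the error-minimizing penalty favors a sparser model, which is consistent with the use of LASSO here to screen a reduced set of feature variables.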