1. A machine learning-based model for predicting the risk of diabetic kidney disease in type 2 diabetes mellitus
Tingting LI ; Peng SU ; Jinbo CHEN ; Xiaoyan HE ; Yi CAO ; Xin ZHANG ; Qingling TANG ; Xubin MIAO ; Xiaohua LIANG ; Dong MA
Chinese Journal of Diabetes 2025;33(4):241-247
Objective To compare candidate models and identify the optimal one for predicting the risk of diabetic kidney disease (DKD) in patients with type 2 diabetes mellitus (T2DM). Methods A total of 2005 patients with T2DM were enrolled from The Second Hospital of Shijiazhuang City between December 2017 and December 2022. The subjects were divided into a training set (n=1403) and a validation set (n=602) at a ratio of 3:1 by simple random sampling. With the occurrence of DKD as the outcome variable in the training set, important feature variables were screened by LASSO regression. Six different machine learning models were built on the selected feature variables, the optimal model was determined by comparison, and an online risk predictor for DKD occurrence in patients with T2DM was constructed. Results With the occurrence of DKD as the outcome variable in the training set, LASSO regression with 10-fold cross-validation gave an optimal penalty of lambda.1se=0.01662473, and 15 feature variables with nonzero coefficients were selected as related to the occurrence of DKD. These variables were sex, age, family history of DM, DM duration, LDL-C, HbA1c, WBC, PDW, Scr, urine α1-microglobulin, urine β2-microglobulin, urine microalbumin, hypertension, hypokalemia, and DR. In the training set and validation set, respectively, the XGBoost model outperformed the other models (AUC=0.872 and 0.893; 95%CI 0.853~0.891 and 0.865~0.921), with a sensitivity of 0.779 and 0.863, a specificity of 0.721 and 0.758, and F1 scores of 0.774 and 0.787. Decision curve analysis (DCA) showed that the XGBoost model had a greater net benefit and threshold probability. An online predictor of DKD risk in T2DM patients was built on the XGBoost model and applied to two selected patients: the predicted value was 0.185 for the non-DKD patient and 0.510 for the DKD patient. Conclusions The XGBoost model is the best model for predicting the occurrence of DKD in T2DM patients, and an online predictor was successfully built.
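
The sensitivity, specificity, F1 score, and DCA net benefit reported above all follow from the confusion matrix at a chosen probability threshold. The following is a minimal Python sketch of those calculations, not the authors' implementation: the labels and predicted risks are made-up illustration values (not study data), the function names are hypothetical, and the net-benefit formula is the standard decision-curve definition, TP/n - FP/n × pt/(1 - pt).

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # DKD patients predicted as DKD
    fp: int  # non-DKD patients predicted as DKD
    tn: int  # non-DKD patients predicted as non-DKD
    fn: int  # DKD patients predicted as non-DKD


def confusion_at(y_true, y_prob, threshold):
    """Count outcomes when a predicted risk >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        positive = prob >= threshold
        if positive and label == 1:
            tp += 1
        elif positive and label == 0:
            fp += 1
        elif not positive and label == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def sensitivity(c):
    return c.tp / (c.tp + c.fn)


def specificity(c):
    return c.tn / (c.tn + c.fp)


def f1_score(c):
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def net_benefit(c, threshold):
    """Standard decision-curve net benefit at threshold probability pt."""
    n = c.tp + c.fp + c.tn + c.fn
    return c.tp / n - (c.fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = DKD, 0 = non-DKD (assumes both classes are present).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.8, 0.3]

c = confusion_at(y_true, y_prob, threshold=0.5)
print(c, sensitivity(c), specificity(c), f1_score(c), net_benefit(c, 0.5))
```

Sweeping `threshold` over a grid and plotting `net_benefit` against it gives a decision curve that can be compared across models.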
