Recurrence risk prediction models of postoperative patients with renal cell carcinoma based on machine learning
10.3969/j.issn.1009-8291.2025.03.012
- VernacularTitle:基于机器学习的肾癌患者术后复发风险预测模型的构建与评价
- Author: Peipei WANG (1); Zhao HOU (2); Hui MA (2); Dingyang LYU (2); Qiwei WANG (2); Weibing SHUANG (1)
- Author Information:
1. Department of Urology, The First Hospital, Taiyuan 030001, China; Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan 030001, China
2. Department of Urology, The First Hospital, Taiyuan 030001, China
- Publication Type:Journal Article
- Keywords: renal cell carcinoma; recurrence; machine learning; prediction model; logistic; decision tree; random forest; Bayes
- From: Journal of Modern Urology 2025;30(3):240-247
- Country: China
- Language:Chinese
- Abstract:
Objective: To explore the factors influencing postoperative recurrence in patients with renal cell carcinoma, construct machine learning prediction models, and evaluate their performance. Methods: Clinical data of 915 patients with renal cell carcinoma treated in our hospital between 2013 and 2021 were retrospectively collected. The data were randomly divided into a training set (n=510) and a validation set (n=218) in a 7:3 ratio. In the training set, LASSO regression was used to screen important variables, and machine learning models were constructed to predict recurrence risk. In the validation set, model performance was compared using the area under the receiver operating characteristic curve (AUC), accuracy, F1 score, and other metrics. Results: LASSO regression screened out risk factors including smoking history, tumor size, N stage, Fuhrman grade, thrombin time, and fibrinogen, on the basis of which logistic, decision tree, random forest, and Bayes models were constructed. In the validation set, the AUCs of the four models were 0.862, 0.792, 0.843, and 0.861, respectively; the accuracies were 0.917, 0.908, 0.904, and 0.927, respectively; and the F1 scores were 0.357, 0.286, 0.323, and 0.600, respectively. The Bayes model had the most stable performance and the best discrimination. Conclusion: In this data set, the prediction model based on the Bayes algorithm performed well and can provide a reference for clinical decision making.
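
The validation metrics reported above pair high accuracy (about 0.90 to 0.93) with much lower F1 values (0.286 to 0.600). That pattern is consistent with a class-imbalanced outcome, although this is an assumption, since the abstract does not report the recurrence rate. The sketch below is not the authors' code. It only illustrates how accuracy, F1, and AUC are computed and why they can diverge. The toy labels, scores, 0.5 threshold, and function names are all invented for illustration.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 (recurrence) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all cases classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy validation set: 20 patients, 2 recurrences (label 1).
y_true: List[int] = [1, 1] + [0] * 18
# Hypothetical predicted recurrence probabilities from some model.
scores: List[float] = [0.60, 0.30, 0.55] + [0.10] * 17
# Classify as "recurrence" when the score reaches an assumed 0.5 threshold.
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1_value = f1_score(y_true, y_pred)
auc_value = auc(y_true, scores)

print(f"accuracy={acc:.3f} F1={f1_value:.3f} AUC={auc_value:.3f}")
# prints: accuracy=0.900 F1=0.500 AUC=0.972
```

In this toy example one missed recurrence and one false alarm leave accuracy at 0.90 but cut F1 to 0.50, because F1 ignores the many correctly classified non-recurrent cases. AUC is computed from the ranking of the scores and does not depend on the threshold.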