Development and validation of a random survival forest model for prognosis prediction in extrahepatic cholangiocarcinoma after radical resection
10.7659/j.issn.1005-6947.250160
- VernacularTitle:基于随机生存森林模型的肝外胆管癌根治术后预后预测模型的构建与验证
- Author:Shiwei WU(1); Zhetai XIAO; Zhanyu QIN; Boyu WANG; Yang SHI
- Author Information:1. Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu 215000
- Publication Type:Journal Article
- Keywords:Bile Duct Neoplasms; Bile Ducts, Extrahepatic; Machine Learning; Prognostic Model
- From:Chinese Journal of General Surgery 2025;34(8):1696-1708
- Country:China
- Language:Chinese
- Abstract:
Background and Aims: Extrahepatic cholangiocarcinoma (ECCA) is a malignancy with insidious onset, strong invasiveness, and poor prognosis, characterized by a high postoperative recurrence rate and a 5-year overall survival of less than 20%. Most existing prognostic models are based on the Cox proportional hazards model, which is limited by the proportional hazards assumption and linearity constraints. The random survival forest (RSF) model, a novel machine learning algorithm, can capture complex interactions and nonlinear effects among variables; however, its application in ECCA remains scarce. Therefore, this study developed a prognostic model for ECCA patients after radical resection using the RSF algorithm, aiming to provide precise and individualized prognostic assessments and support clinical decision-making.

Methods: A total of 515 postoperative ECCA patients from the SEER database (2016-2021) were retrospectively enrolled and randomly divided into a training set (n=361) and a test set (n=154). Demographic and clinical variables were collected. Cox models were developed using univariate and multivariate regression, while RSF models were constructed using variable importance (VIMP) and minimal depth methods. Model performance was evaluated using the concordance index (C-index), time-dependent area under the curve (AUC), Brier scores, calibration plots, and decision curve analysis. Survival differences were assessed using Kaplan-Meier analysis, and interpretability was enhanced through the use of SurvSHAP and SurvLIME.

Results: Multivariate Cox regression identified seven independent prognostic factors: age, race, income, T stage, N stage, tumor size, and chemotherapy. The RSF model selected four key predictors: age, tumor size, lymph node positive rate, and chemotherapy. In the test cohort, the RSF model achieved a C-index of 0.751, outperforming the Cox model (0.711). The RSF model yielded AUCs of 0.843, 0.749, and 0.814 at 1, 2, and 3 years, respectively, with superior calibration, overall performance, and net clinical benefit. Nonlinear associations were observed for lymph node positive rate, age, and tumor size, while chemotherapy was associated with reduced mortality risk. Stratified survival curves indicated poorer prognosis in patients without chemotherapy, lymph node positive rate >0.1, age >70 years, or tumor size >20 mm.

Conclusion: The RSF model, based on only four readily available clinical variables, demonstrated superior predictive performance compared with the Cox model. It provides a reliable tool for individualized prognosis and postoperative management in ECCA patients. The integration of interpretability frameworks further enhances its clinical applicability, offering potential to improve survival outcomes and quality of life.
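
The concordance index (C-index) used above to compare the RSF and Cox models can be illustrated with a minimal sketch. This is not the authors' code: the function name `harrell_c_index` and the four-patient toy data are hypothetical, and the sketch assumes the common Harrell formulation, in which a pair of patients is comparable only when the one with the shorter follow-up had an observed event. Pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified sketch
        if not events[a]:
            continue  # shorter follow-up is censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (months of follow-up, event flags, predicted risk).
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.2, 0.6]
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.751 (RSF) and 0.711 (Cox) should be read. Established survival analysis libraries handle ties and censoring more carefully than this sketch.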