Efficacy of machine learning models versus Cox regression model for predicting prognosis of esophagogastric junction adenocarcinoma.
10.12122/j.issn.1673-4254.2023.06.10
- Author:
Kaiji GAO 1; Yihao WANG 1; Haikun CAO 1; Jianguang JIA 1
Author Information
1. Department of Surgical Oncology, First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, China.
- Publication Type:Journal Article
- Keywords:
Cox proportional hazard regression model;
artificial intelligence;
esophagogastric junction adenocarcinoma;
machine learning
- MeSH:
Humans;
Adenocarcinoma;
Prognosis;
Machine Learning;
Esophagogastric Junction
- From:
Journal of Southern Medical University
2023;43(6):952-963
- Country:China
- Language:Chinese
- Abstract:
OBJECTIVE:To compare the performance of machine learning models and the traditional Cox regression model in predicting postoperative outcomes of patients with adenocarcinoma of the esophagogastric junction (AEG).
METHODS:This study included 203 AEG patients with complete clinical and follow-up data who were treated in our hospital between September 2015 and October 2020. The clinicopathological data were processed in R and divided into training and validation datasets at a ratio of 3:1. A Cox proportional hazards regression model and 4 machine learning models were constructed on these datasets. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) curves were plotted. Internal validation of the machine learning models was performed to assess their predictive efficacy. The predictive performance of each model was evaluated by the area under the ROC curve (AUC), and model fit was assessed using the calibration curve.
RESULTS:For predicting 3-year survival in the validation dataset, the AUC was 0.870 for the Cox proportional hazards regression model, 0.901 for eXtreme Gradient Boosting (XGBoost), 0.791 for random forest, 0.832 for support vector machine, and 0.725 for multilayer perceptron; for predicting 5-year survival, the AUCs of these models were 0.915, 0.916, 0.758, 0.905, and 0.737, respectively. In internal validation, the AUCs of the 4 machine learning models were 0.818 for XGBoost, 0.758 for random forest, 0.804 for support vector machine, and 0.745 for multilayer perceptron.
CONCLUSION:The machine learning models show better predictive efficacy for survival outcomes of patients with AEG than the Cox proportional hazards regression model, especially when the proportional hazards assumption or linear regression models are not applicable. XGBoost performs better than the other machine learning models, and the multilayer perceptron may fit poorly when the data volume is limited.
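The two quantitative steps named in the methods, the 3:1 split and the AUC, can be illustrated with a minimal Python sketch. This is not the authors' code (the study used R). The function names `split_3_to_1` and `auc`, the random shuffle, the seed values, and the synthetic patient records are all assumptions made for illustration. The sketch treats survival at a fixed time point as a binary label and ignores censoring, which a real survival analysis would have to handle.

```python
import random


def split_3_to_1(items, seed=0):
    """Shuffle and split into training (about 3/4) and validation (about 1/4) sets.

    Assumption: a simple random split; the abstract does not say how the split was made.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * 0.75)
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random negative case.

    Ties count as half a win. Labels are 1 (event) or 0 (no event).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data only: 203 hypothetical patients with a binary outcome and a
# model risk score. These values are invented and unrelated to the study data.
rng = random.Random(42)
patients = []
for _ in range(203):
    died = rng.random() < 0.4
    risk = rng.gauss(1.0 if died else 0.0, 1.0)
    patients.append((int(died), risk))

train, valid = split_3_to_1(patients)
valid_auc = auc([y for y, _ in valid], [s for _, s in valid])
print(len(train), len(valid), round(valid_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values reported above are compared.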