Construction of machine learning-based prediction model for clinically relevant delayed gastric emptying after LPD
10.3760/cma.j.cn113884-20240911-00276
- VernacularTitle:基于机器学习的LPD术后临床相关胃排空延迟风险预测模型的构建
- Author:Jizhen LI 1; Hengli ZHU; Qingan FU; Changqian TANG; Xingbo WEI; Chiyu CAI; Liancai WANG; Dongxiao LI; Deyu LI
Author Information
1. Department of Hepatobiliary and Pancreatic Surgery, People's Hospital of Zhengzhou University, Zhengzhou 450003, China
- Publication Type:Journal Article
- Keywords:Laparoscopes; Pancreaticoduodenectomy; Clinically relevant delayed gastric emptying; Machine learning
- From:Chinese Journal of Hepatobiliary Surgery 2025;31(2):101-106
- Country:China
- Language:Chinese
- Abstract:
Objective: To analyze the risk factors for clinically relevant delayed gastric emptying (CR-DGE) after laparoscopic pancreaticoduodenectomy (LPD) and to develop a model predicting postoperative CR-DGE using a machine learning approach with multi-model comparison. Methods: Clinical data of 278 patients with tumors of the pancreatic head or periampullary region who underwent LPD at People’s Hospital of Zhengzhou University from January 2019 to December 2023 were retrospectively analyzed. The cohort included 167 males and 111 females, aged 59 (53, 66) years. According to the occurrence of CR-DGE, patients were divided into the CR-DGE group (n=94) and the non-CR-DGE group (n=184). Main clinical characteristics, including pancreatic duct diameter, intraoperative blood loss and operative time, were compared between the groups. Perioperative indicators were selected using the least absolute shrinkage and selection operator (LASSO) algorithm. Following variable selection, the 278 patients were allocated to a training set (n=222) and a validation set (n=56) in an 8:2 ratio. Eight machine learning models were fitted to the training set: random forest, adaptive boosting, light gradient boosting, multilayer perceptron, support vector machine, K-nearest neighbor, decision tree and complement naive Bayes. The area under the receiver operating characteristic curve (AUC) in the validation set was used to identify the optimal model. The predictive performance of the optimal model was evaluated using calibration plots and decision curve analysis (DCA). The contribution of each feature to the prediction was assessed using Shapley additive explanation (SHAP). Results: Univariate analysis showed statistically significant differences between the CR-DGE and non-CR-DGE groups in age [66 (62, 69) years vs. 56 (51, 60) years], diabetes [42.6% (40/94) vs. 11.4% (21/184)], fibrinogen level [3.43 (2.74, 4.18) g/L vs. 3.84 (3.19, 4.68) g/L], pancreatic duct diameter [2.00 (1.50, 2.70) mm vs. 3.40 (1.60, 5.00) mm], intraoperative blood loss [300 (200, 600) mL vs. 200 (150, 300) mL], operative time [472 (430, 502) min vs. 430 (365, 475) min], clinically relevant postoperative pancreatic fistula [34.0% (32/94) vs. 3.8% (7/184)], abdominal fluid accumulation [46.8% (44/94) vs. 12.5% (23/184)], postoperative hemorrhage [20.2% (19/94) vs. 3.3% (6/184)], abdominal infection [28.7% (27/94) vs. 11.4% (21/184)] and duration of postoperative gastrointestinal decompression [4.00 (2.00, 6.00) d vs. 3.00 (2.00, 5.00) d] (all P<0.05). The eleven variables selected via LASSO were incorporated into each of the eight machine learning models. The random forest model achieved the highest performance in the validation set, with an AUC of 0.894 (95% CI: 0.800-0.985), an accuracy of 0.820 and a sensitivity of 0.606. Calibration plots and DCA confirmed the robustness of the random forest model. SHAP analysis indicated that age, pancreatic duct diameter and preoperative aspartate aminotransferase were important predictors in the random forest model. Conclusion: The random forest model developed in this study demonstrated good predictive performance for CR-DGE after LPD and may assist in the early identification of high-risk patients in clinical practice.
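
The validation metrics named in the abstract (AUC, accuracy, sensitivity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 threshold and the eight example patients are assumptions made purely for illustration, and the numbers are invented, not study data. The AUC is computed here as the fraction of (CR-DGE, non-CR-DGE) patient pairs in which the CR-DGE patient receives the higher predicted risk, which is equivalent to the area under the ROC curve.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 = CR-DGE, 0 = non-CR-DGE; scores: predicted risk.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_sensitivity(labels, scores, threshold=0.5):
    """Classify as CR-DGE when score >= threshold (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    true_pos = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    n_pos = sum(labels)
    return correct / len(labels), true_pos / n_pos


# Invented toy validation set: 3 CR-DGE and 5 non-CR-DGE patients.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
acc, sens = accuracy_and_sensitivity(labels, scores)
print(f"AUC={auc:.3f} accuracy={acc:.3f} sensitivity={sens:.3f}")
```

In a multi-model comparison such as the one described, the same `roc_auc` calculation would be applied to each candidate model's validation-set predictions, and the model with the highest value retained.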