Construction and validation of a machine learning model for preoperative prediction of perineural invasion status in intrahepatic cholangiocarcinoma
10.3760/cma.j.cn113884-20231227-00183
- VernacularTitle:术前预测肝内胆管癌患者神经侵犯状态机器学习模型的构建和验证
- Author: Zuochao QI¹; Zhenwei YANG; Qingshan LI; Hao YUAN; Pengyu CHEN; Haofeng ZHANG; Yanbo WANG; Dongxiao LI; Bo MENG; Haibo YU; Deyu LI
Author Information
1. Department of Hepatobiliary and Pancreatic Surgery, Zhengzhou University People's Hospital, Zhengzhou 450003
- Keywords:
Bile duct neoplasms;
Intrahepatic cholangiocarcinoma;
Perineural invasion;
Machine learning;
Predictive model
- From:
Chinese Journal of Hepatobiliary Surgery
2024;30(6):424-430
- Country: China
- Language:Chinese
- Abstract:
Objective: To construct and validate a machine learning model for preoperative prediction of perineural invasion (PNI) status in intrahepatic cholangiocarcinoma (ICC).

Methods: Clinical data of 329 patients were retrospectively analyzed, including 245 admitted to Zhengzhou University People's Hospital from January 2018 to June 2023 and 84 admitted to the Affiliated Cancer Hospital of Zhengzhou University from January 2013 to January 2020. Patients were divided into a training set (n=231) and a validation set (n=98). Clinicopathological data, including age, gender, and hepatitis B virus (HBV) infection status, were collected. Predictive variables were selected using least absolute shrinkage and selection operator (LASSO) regression analysis. Six machine learning algorithms, including random forest (RF), logistic regression, and linear-kernel support vector machine, were used to construct preoperative prediction models for PNI in ICC. Performance metrics of the models were calculated from a confusion matrix, and the final model was selected. The performance of the final model was then evaluated in the validation set. Calibration curves were plotted to evaluate the final model, and a Pareto chart was used to visualize the importance of the predictive variables.

Results: LASSO regression identified nine predictive variables for inclusion in the prediction model: carbohydrate antigen 19-9 (CA19-9), HBV infection status, alkaline phosphatase, alanine aminotransferase, prothrombin time, total bilirubin, albumin, the neutrophil times gamma-glutamyl transferase to lymphocyte ratio, and tumor burden score. Among the six trained models, the RF model had an area under the curve (AUC) of 0.909, with a sensitivity of 0.842 and an accuracy of 0.870. The AUCs of the other five models were all lower than that of the RF model (all P<0.05). In the validation set, the AUC of the RF model for predicting PNI in ICC was 0.736. Calibration curves showed good fit of the RF model's predictions of PNI in ICC in both the training and validation sets. The Pareto chart showed that CA19-9 was the most important predictive variable in the model, followed by HBV infection status.

Conclusion: The machine learning model based on the RF algorithm has high accuracy in preoperative prediction of PNI status in ICC.
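
As an illustration of the performance metrics named in the abstract (sensitivity and accuracy from a confusion matrix, and the AUC), the following is a minimal Python sketch. The abstract does not state which software, decision threshold, or AUC estimator the authors used, so the function names, the 0.5 threshold, and the toy labels and scores below are all assumptions made for illustration, not the study's data or method.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = PNI-positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true PNI-positive cases that the model flags as positive."""
    return tp / (tp + fn)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases that the model classifies correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc_rank(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    print("sensitivity:", sensitivity(tp, fn))
    print("accuracy:", accuracy(tp, fp, tn, fn))
    print("AUC:", auc_rank(y_true, scores))
```

Sensitivity and accuracy depend on the chosen threshold, whereas the AUC is computed from the ranking of the predicted scores alone, which is one reason the two kinds of metric are usually reported together.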