Construction of postoperative prognostic model for primary liver cancer based on SMOTE and machine learning
10.16016/j.2097-0927.202310052
- VernacularTitle:基于SMOTE算法和机器学习模型建立原发性肝癌术后的预后预测模型
- Author:Bi PAN¹; Jinghua YU; Yixian HUANG; Yazhou WU; Fang LI
Author Information
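1. Department of Military Health Statistics, Faculty of Military Preventive Medicine, Army Medical University (Third Military Medical University), Chongqing 400038, China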
- Keywords:
primary liver cancer;
SMOTE algorithm;
machine learning;
prediction model
- From:
Journal of Army Medical University
2024;46(19):2236-2240
- Country:China
- Language:Chinese
- Abstract:
Objective To construct a prognostic prediction model for primary liver cancer after surgical treatment based on the synthetic minority over-sampling technique (SMOTE) algorithm and machine learning models. Methods A retrospective cohort study was conducted on 4,297 patients with primary liver cancer from the Surveillance, Epidemiology, and End Results (SEER) database. One-hot encoding and multiple imputation were used to preprocess the collected data, and the SMOTE algorithm was employed to address the class imbalance in the data. The resulting clinical variables were entered into the machine learning models. Prognostic prediction models (SMOTE+DT/RF/GBDT/XGBoost) were built on decision tree (DT), random forest (RF), gradient boosting decision tree (GBDT), and eXtreme Gradient Boosting (XGBoost), and the best prediction model was determined by comparing the performance of these models. Finally, a prognostic analysis system for primary liver cancer was developed based on the optimal model and then visualized. Results The combined model SMOTE+RF showed the best predictive performance, with a higher area under the receiver operating characteristic (ROC) curve (0.895), accuracy (0.811), and precision (0.806) than the other models. Conclusion The SMOTE+RF prognostic prediction model can effectively predict the survival outcome of patients with primary liver cancer.
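
The oversampling step named in the Methods can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything below is an assumption made for illustration: the function names `smote` and `balance`, the neighbour count `k`, the toy two-feature data, and the choice to oversample the minority class until it matches the majority class. The sketch uses only the Python standard library and shows the core idea of SMOTE: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    if n_new <= 0:
        return list(X), list(y)
    new = smote(minority, n_new, k=k, seed=seed)
    return list(X) + new, list(y) + [minority_label] * len(new)


# Toy data (assumed): six majority-class rows (0) and three minority rows (1).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 2], [3, 3], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = balance(X, y)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In practice a maintained implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead of hand-written code, and oversampling is usually applied to the training split only, so that synthetic samples do not leak into the data used to estimate AUC, accuracy, and precision.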