Establishment and evaluation of the model for predicting lung cancer occurrence in COPD patients based on XGBoost
- VernacularTitle:基于XGBoost的COPD患者肺癌发生预测模型的建立与评价
- Author: Jing YANG; Tong JIAO; Yujiao DONG; Chenyu YAO; Qunyu KONG; Jie SHI; Shuanying YANG
- Publication Type:Journal Article
- Keywords: chronic obstructive pulmonary disease(COPD); risk assessment; prediction model; XGBoost; SHAP
- From: Journal of Xi'an Jiaotong University(Medical Sciences) 2025;46(2):345-352
- Country:China
- Language:Chinese
- Abstract: Objective To construct an XGBoost predictive model using clinical characteristic data from patients with chronic obstructive pulmonary disease (COPD) and to evaluate its efficacy for early risk prediction of lung cancer occurrence in COPD patients. Methods In this retrospective cross-sectional study, cluster sampling was used to select clinically diagnosed COPD patients admitted to The Second Affiliated Hospital of Xi'an Jiaotong University from January 1, 2018, to December 31, 2022. A total of 4,008 patients with complete data were included. First, the baseline characteristics were analyzed; XGBoost was then used to construct the lung cancer risk prediction model for COPD patients, and SHAP (SHapley Additive exPlanations) values were used to quantify and attribute the importance of each characteristic. Decision curve analysis (DCA) was used to evaluate clinical application value. Results After a lung cancer risk model for COPD patients was constructed using 28 variables, eight variables were selected according to variable importance and clinical experience, and the prediction model was rebuilt. The model efficacy in the training set and the test set was 0.948 (0.938, 0.958) and 0.797 (0.738, 0.856), respectively. The SHAP plot showed that elevated CEA, CA125, FIB, eosinophils, PLT and D-dimer and reduced TT all contributed to an increased risk of lung cancer in COPD patients. The DCA curve showed that the prediction model had clinical application value, which could help doctors make more accurate prognosis predictions and treatment decisions. Conclusion An XGBoost predictive model using a subset of features was successfully established and enables early prediction of lung cancer occurrence in COPD patients.
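
The abstract reports a decision curve analysis but does not describe how it was computed. As a minimal sketch of the standard net-benefit calculation that underlies a DCA curve, the Python below may help readers unfamiliar with the method. It is not the authors' implementation: the function names `net_benefit` and `net_benefit_treat_all` and the ten-patient toy data are hypothetical and serve only as illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every patient whose predicted risk >= threshold.

    Standard DCA definition: TP/n - FP/n * (pt / (1 - pt)),
    where pt is the threshold probability.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (NOT from the study): 1 = lung cancer, 0 = no lung cancer.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predicted risks, standing in for a fitted model's output.
y_prob = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.05, 0.02]

# One point per threshold; plotting these against the threshold gives the curve.
thresholds = [0.1, 0.2, 0.3, 0.4, 0.5]
model_curve = [net_benefit(y_true, y_prob, t) for t in thresholds]
treat_all_curve = [net_benefit_treat_all(y_true, t) for t in thresholds]
```

A model is usually judged clinically useful over the range of thresholds where its net benefit exceeds both the "treat all" curve and the "treat none" line, which is fixed at zero.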
