Identification of risk factors for pneumoconiosis-related complications and development and application of an XGBoost-based early prediction model
- VernacularTitle:尘肺病并发症危险因素识别及XGBoost早期预测模型的建立与应用
- Author: Li ZHANG 1; Peng PENG 2; Yun WANG 3; Dong LUO 1
- Publication Type:Investigation
- Keywords: Chongqing; pneumoconiosis; complication; machine learning; ten-fold cross-validation training of extreme gradient boosting model; SHapley Additive exPlanations
- From: Journal of Environmental and Occupational Medicine 2026;43(3):302-310
- Country:China
- Language:Chinese
- Abstract:
Background As one of the most severe occupational diseases in China, pneumoconiosis carries a heavy burden of complications, which adversely affect patients' quality of life.
Objective To identify the factors influencing complications in pneumoconiosis and to construct an early prediction model for these complications, providing theoretical guidance for clinical diagnosis, treatment, and rehabilitation.
Methods A case-control study was conducted using data from the Chongqing 5G Pneumoconiosis Rehabilitation Management Information Platform. A total of 1872 pneumoconiosis patients with complications who received rehabilitation at rehabilitation stations in Chongqing from January 2021 to December 2024 were enrolled as the case group, and 2348 patients without complications from the same platform and period were included as the control group. Univariate analysis, LASSO regression, and logistic regression were used to identify factors influencing complications. An extreme gradient boosting (XGBoost) model was trained with ten-fold cross-validation to predict pneumoconiosis complications. The SHapley Additive exPlanations (SHAP) method was applied to visualize the model.
Results Increased age, silicosis type, advanced stage of pneumoconiosis, dependence in activities of daily living, abnormal muscle strength, increased dyspnea index, presence of cough, viscous sputum, impaired cardiac function, severe abnormality in the 6-minute walk test (6MWT), and increased Borg score were risk factors for complications (P<0.05). The XGBoost model achieved an area under the receiver operating characteristic curve (AUC) of 0.718 and a Brier score of 0.211 for calibration. Decision curve analysis showed a higher net benefit than the "all-patient intervention" and "no-patient intervention" strategies across threshold probabilities from 0.2 to 1.0, indicating good predictive performance and significant clinical utility. SHAP visualization identified the six most important features as Borg score, age, 6MWT, muscle strength, pneumoconiosis stage, and cough symptoms.
Conclusion Multiple factors influence the occurrence of complications in pneumoconiosis patients. Clinical attention should focus on elderly patients with decreased cardiopulmonary reserve, pronounced respiratory symptoms, and diminished muscle strength. The XGBoost model demonstrates satisfactory discrimination, calibration, and clinical applicability in predicting pneumoconiosis complications and may serve as a useful reference for clinical decision-making.
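
The three evaluation measures named in the abstract (AUC for discrimination, Brier score for calibration, and net benefit for decision curve analysis) follow standard textbook definitions, and the sketch below illustrates them in plain Python. It is not the authors' code: the function names and the four-patient toy data are assumptions made for illustration, and the study's own figures (0.718, 0.211) cannot be reproduced from it.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random case outranks a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Calibration: mean squared gap between predicted risk and 0/1 outcome."""
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on patients with predicted risk >= threshold.

    False positives are weighted by threshold / (1 - threshold), so the
    value is undefined at a threshold of exactly 1.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the "all-patient intervention" reference strategy."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = complication, 0 = no complication.
    outcomes = [1, 0, 1, 0]
    risks = [0.9, 0.2, 0.6, 0.4]
    print("AUC:", auc(outcomes, risks))
    print("Brier score:", brier_score(outcomes, risks))
    for t in (0.2, 0.5, 0.8):
        print(t, net_benefit(outcomes, risks, t), net_benefit_treat_all(outcomes, t))
```

The "no-patient intervention" strategy has a net benefit of zero at every threshold, so a model is usually judged useful at a given threshold when its net benefit exceeds both zero and the treat-all value.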
