Application Value of an AI-based Imaging Feature Parameter Model
for Predicting the Malignancy of Part-solid Pulmonary Nodule.
10.3779/j.issn.1009-3419.2025.102.13
- Author:
Mingzhi LIN 1;
Yiming HUI 1;
Bin LI 1;
Peilin ZHAO 1;
Zhizhong ZHENG 1;
Zhuowen YANG 1;
Zhipeng SU 1;
Yuqi MENG 1;
Tieniu SONG 1
Author Information
1. Department of Thoracic Surgery, The Second Hospital & Clinical Medical School, Lanzhou University, Lanzhou 730030, China.
- Publication Type:Journal Article
- Keywords:
Artificial intelligence;
Lung neoplasms;
Machine learning;
Part-solid nodule;
Prediction model
- MeSH:
Humans;
Male;
Female;
Lung Neoplasms/pathology*;
Middle Aged;
Retrospective Studies;
Artificial Intelligence;
Aged;
Tomography, X-Ray Computed;
Adult;
Solitary Pulmonary Nodule/diagnostic imaging*;
ROC Curve
- From:
Chinese Journal of Lung Cancer
2025;28(4):281-290
- Country:China
- Language:Chinese
- Abstract:
BACKGROUND:Lung cancer is one of the most common malignant tumors worldwide and a major cause of cancer-related death. Early-stage lung cancer often manifests as pulmonary nodules, and accurate assessment of malignancy risk is crucial for prolonging survival and avoiding overtreatment. This study aims to construct a model based on imaging feature parameters automatically extracted by artificial intelligence (AI) and to evaluate its effectiveness in predicting the malignancy of part-solid nodules (PSN).
METHODS:This retrospective study analyzed 229 PSN from 222 patients who underwent pulmonary nodule resection at Lanzhou University Second Hospital between October 2020 and February 2025. According to the pathological results, 45 benign lesions and precursor glandular lesions were assigned to the non-malignant group, and 184 pulmonary malignancies were assigned to the malignant group. All patients underwent preoperative chest computed tomography (CT), and AI software was used to extract imaging feature parameters. Univariate analysis was used to screen for significant variables; the variance inflation factor (VIF) was calculated to exclude highly collinear variables, and LASSO regression was then applied to identify key features. Multivariate logistic regression was used to determine independent risk factors. Based on the selected variables, five models were constructed: logistic regression, random forest, XGBoost, LightGBM, and support vector machine (SVM). Receiver operating characteristic (ROC) curves were used to assess model performance.
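Two of the steps named above, VIF-based exclusion of collinear variables and ROC-based evaluation, can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the feature names (`feat_a`, `feat_b`, `feat_c`) and the helper functions are hypothetical, and the VIF cutoff of 10 is an assumed conventional value, since the abstract does not state the threshold used. NumPy is assumed to be available.

```python
import numpy as np

def vif(X):
    """VIF of each column, taken as the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def drop_collinear(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all VIFs <= threshold.

    The threshold of 10 is an assumption, not a value taken from the study.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(float(p > n) + 0.5 * float(p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic demo: feat_c is almost a copy of feat_a, so one of the two is dropped.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.01 * rng.normal(size=200)
X_kept, kept = drop_collinear(np.column_stack([a, b, c]),
                              ["feat_a", "feat_b", "feat_c"])

# Toy labels and scores (hypothetical, not study data).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(kept, auc)
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve; it is quadratic in sample size, which is acceptable for a cohort of a few hundred nodules.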
RESULTS:The independent risk factors for malignancy of PSN were roughness (ngtdm), dependence variance (gldm), and short run low gray-level emphasis (glrlm). Logistic regression achieved areas under the curve (AUCs) of 0.86 and 0.89 in the training and testing sets, respectively, showing good performance. XGBoost had AUCs of 0.78 and 0.77, respectively, demonstrating relatively balanced performance but lower accuracy. SVM showed an AUC of 0.93 in the training set, which decreased to 0.80 in the testing set, indicating overfitting. LightGBM performed excellently in the training set, with an AUC of 0.94, but its performance declined in the testing set, with an AUC of 0.88. In contrast, random forest performed consistently in the training and testing sets, with AUCs of 0.89 and 0.91, respectively, exhibiting high stability and excellent generalizability.
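The overfitting comparison above comes down to the difference between each model's testing and training AUC. The following sketch tabulates that gap using only the AUC values reported in this abstract; the variable names are illustrative.

```python
# (training AUC, testing AUC) as reported in the RESULTS paragraph
reported_auc = {
    "Logistic regression": (0.86, 0.89),
    "XGBoost": (0.78, 0.77),
    "SVM": (0.93, 0.80),
    "LightGBM": (0.94, 0.88),
    "Random forest": (0.89, 0.91),
}

# Negative gap: the model scored lower on the testing set than on the training set.
gaps = {model: round(test - train, 2)
        for model, (train, test) in reported_auc.items()}

largest_drop = min(gaps, key=gaps.get)
best_on_test = max(reported_auc, key=lambda m: reported_auc[m][1])

for model, gap in gaps.items():
    print(f"{model}: test - train = {gap:+.2f}")
print("Largest drop:", largest_drop)
print("Highest testing AUC:", best_on_test)
```

On these figures, SVM shows the largest drop from training to testing and random forest has the highest testing AUC, which matches the reading given in the text.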
CONCLUSIONS:The random forest model built on the independent risk factors performed best in predicting the malignancy of PSN and could provide clinicians with effective auxiliary predictions to support individualized treatment decisions.