A lung cancer early-warning risk model based on facial diagnosis image features
10.1016/j.dcmed.2025.09.007
- VernacularTitle:基于面部望诊图像特征的肺癌风险预警模型研究
- Author: Yulin Shi; Shuyi Zhang; Jiayi Liu; Wenlian Chen; Lingshuang Liu; Ling Xu; Jiatuo Xu
- Publication Type: Journal Article
- Keywords: Inspection; Facial features; Lung cancer; Early-warning risk; Machine learning
- From: Digital Chinese Medicine 2025; 8(3): 351-362
- Country: China
- Language:English
- Abstract:
Objective:To explore the feasibility of constructing a lung cancer early-warning risk model based on facial image features, providing novel insights into the early screening of lung cancer.
Methods: This study included patients with pulmonary nodules diagnosed at the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine between November 1, 2019 and December 31, 2024, as well as patients with lung cancer diagnosed in the Oncology Departments of Yueyang Hospital of Integrated Traditional Chinese and Western Medicine and Longhua Hospital during the same period. Facial images of both groups were collected with the TFDA-1 tongue and facial diagnosis instrument, and facial diagnosis features were extracted from them using deep learning techniques. Statistical analysis of the objective facial diagnosis characteristics of the two groups was conducted to explore differences in their facial image features, and least absolute shrinkage and selection operator (LASSO) regression was used to screen the feature variables. Based on the selected features, four machine learning methods were used to build lung cancer classification models independently: random forest, logistic regression, support vector machine (SVM), and gradient boosting decision tree (GBDT). Model performance was evaluated by sensitivity, specificity, F1 score, precision, accuracy, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall curve (AP).
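The modeling pipeline described above (LASSO-style feature screening followed by a cross-validated classifier) can be sketched with scikit-learn. The study's facial-image features are not available, so synthetic data stands in, and an L1-penalized logistic regression is used here as a LASSO-style selector; the sample size, regularization strength, and SVM kernel below are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: LASSO-style feature selection + SVM with 10-fold
# stratified cross-validation, evaluated by AUC, accuracy, and AP.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for 136 extracted facial features over two balanced groups
# (535 + 535 matched patients in the study).
X, y = make_classification(n_samples=1070, n_features=136,
                           n_informative=30, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # L1-penalized logistic regression as a LASSO-style feature selector;
    # the paper used LASSO regression to keep 63 of 136 features.
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1))),
    ("svm", SVC(kernel="rbf")),
])

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(pipe, X, y, cv=cv,
                        scoring=["roc_auc", "accuracy", "average_precision"])
print("mean AUC:", np.mean(scores["test_roc_auc"]))
print("mean accuracy:", np.mean(scores["test_accuracy"]))
```

The same `cv` splitter and scoring list can be reused to compare the other three model families (random forest, logistic regression, GBDT) on identical folds.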
Results: A total of 1 275 patients with pulmonary nodules and 1 623 patients with lung cancer were included in this study. After propensity score matching (PSM) to adjust for gender and age, 535 patients were included in each of the pulmonary nodule and lung cancer groups. There were significant differences between the two groups in multiple color space metrics (such as R, G, B, V, L, a, b, Cr, H, Y, and Cb) and texture metrics [such as gray-level co-occurrence matrix (GLCM)-contrast (CON) and GLCM-inverse difference moment (IDM)] (P < 0.05). To construct a classification model, LASSO regression was used to select 63 key features from the initial 136 facial features. Based on this feature set, the SVM model demonstrated the best performance under 10-fold stratified cross-validation, achieving an average AUC of 0.8729 and an average accuracy of 0.7990 on the internal test set. Further validation on an independent test set confirmed the model's robust performance (AUC = 0.8233, accuracy = 0.7290), indicating good generalization ability. Feature importance analysis showed that color space indicators and the whole/lip Cr components (including color-B-0, wholecolor-Cr, and lipcolor-Cr) were the core factors in the model's classification decisions, while texture indicators [GLCM-angular second moment (ASM)_2, GLCM-IDM_1, GLCM-CON_1, and GLCM-entropy (ENT)_2] played an important auxiliary role.
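The PSM step above can be illustrated as 1:1 nearest-neighbor matching on a propensity score estimated from gender and age. The abstract does not state the matching algorithm, caliper, or whether matching was done with replacement, so everything below (including the synthetic covariate distributions) is an illustrative assumption.

```python
# Hedged sketch of propensity score matching on gender and age,
# the two covariates the study adjusted for.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n_nodule, n_cancer = 1275, 1623  # group sizes before matching

# Synthetic covariates: gender (0/1) and age in years.
cov = np.column_stack([
    rng.integers(0, 2, n_nodule + n_cancer),
    rng.normal(60, 10, n_nodule + n_cancer),
])
group = np.concatenate([np.zeros(n_nodule), np.ones(n_cancer)])  # 1 = cancer

# Propensity score: P(lung cancer group | gender, age).
ps = LogisticRegression().fit(cov, group).predict_proba(cov)[:, 1]

# Match each pulmonary-nodule patient to the lung-cancer patient with the
# closest propensity score (with replacement, for simplicity).
nodule_idx = np.where(group == 0)[0]
cancer_idx = np.where(group == 1)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[cancer_idx].reshape(-1, 1))
_, nearest = nn.kneighbors(ps[nodule_idx].reshape(-1, 1))
matched_pairs = list(zip(nodule_idx, cancer_idx[nearest[:, 0]]))
print("matched pairs:", len(matched_pairs))
```

In practice, matching without replacement plus a caliper on the propensity score would yield a reduced, balanced sample such as the study's final 535 patients per group.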
Conclusion: The facial image features of patients with lung cancer and those with pulmonary nodules differ significantly in color and texture characteristics across multiple facial areas. The models constructed from facial image features all demonstrate good performance, indicating that facial image features can serve as potential biomarkers for lung cancer risk prediction and offering a non-invasive, feasible new approach to early lung cancer screening.