Construction of a classification model for image subtypes based on the radiomics features of patients with dermatomyositis/polymyositis-related interstitial lung disease for machine learning
10.3760/cma.j.cn141217-20230205-00032
- VernacularTitle:基于皮肌炎/多发性肌炎相关间质性肺疾病患者影像组学特征构建机器学习影像分型的分类模型
- Author:
Chunhui LI¹; Liyu HE; Jingping ZHANG; Tingting HAN; Bingjie ZHU; Youmin GUO; Chenwang JIN
Author Information
1. Department of Radiology, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an 710061
- Keywords:
Dermatomyositis;
Polymyositis;
Lung diseases, interstitial;
Interstitial lung disease;
Imaging genomics;
Machine learning
- From:
Chinese Journal of Rheumatology
2023;27(8):521-526,C8-2
- Country:China
- Language:Chinese
- Abstract:
Objective: To investigate the feasibility of classifying the imaging patterns of dermatomyositis/polymyositis-related interstitial lung disease (DM/PM-ILD) into subtypes based on chest CT radiomics features, and to construct a classification model using machine learning algorithms.
Methods: A total of 107 patients diagnosed with DM/PM-ILD at the First Affiliated Hospital of Xi'an Jiaotong University from November 2011 to November 2020 were retrospectively analyzed, and 315 chest CT scans were collected. Physicians pre-classified the imaging patterns: 105 scans showed non-specific interstitial pneumonia (NSIP), 90 organizing pneumonia (OP), 66 non-specific interstitial pneumonia combined with organizing pneumonia (NSIP+OP), 35 usual interstitial pneumonia (UIP), and 19 diffuse alveolar damage (DAD). ANOVA was used to test for differences in baseline clinical information among the imaging pattern groups. All images were divided into a training set and a test set by stratified random sampling at a ratio of 4∶1. For each CT scan, 3D Slicer was used to segment each lung lobe, the images were reconstructed into 3 mm³ voxels, and the Pyradiomics library was used to extract radiomic features of the whole lung and of each lobe. The multi-class goal was achieved by constructing a random forest base classifier for each of the five groups and then combining them by voting to form the final model. When constructing each base classifier, the sample groups were first balanced by SMOTETomek combined sampling, and the optimal feature set was then selected by the independent-samples t test and L1-regularized least absolute shrinkage and selection operator (LASSO) regression. The Radiomics model was constructed from chest CT radiomics features alone, and the Radiomics + model was constructed by additionally introducing sex and age. The base classifiers were evaluated by mean accuracy, and the integrated model was evaluated by the area under the receiver operating characteristic curve (AUC).
Results: Age differed significantly among the NSIP, OP, NSIP+OP, UIP, and DAD groups [(57±13), (53±8), (54±10), (44±11), and (46±8) years old, respectively; F=11.82, P<0.001]. In the Radiomics model, the training set AUCs for the NSIP, OP, NSIP+OP, UIP, and DAD groups were 0.87, 0.91, 0.91, 0.96, and 0.99, respectively, and the test set AUCs were 0.81, 0.82, 0.79, 0.93, and 0.89. In the final Radiomics + model, the training set AUCs for the same groups were 0.89, 0.91, 0.92, 0.97, and 0.99, respectively, and the test set AUCs were 0.84, 0.82, 0.78, 0.94, and 0.90.
Conclusion: Based on chest CT radiomics features and key clinical features (sex, age), the Radiomics + model constructed by machine learning shows good classification performance for the imaging patterns of DM/PM-ILD.
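The modelling steps described in the Methods can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The feature table, the thresholds (`p_threshold`, `alpha`), the number of trees, and all function names are assumptions. The SMOTETomek resampling step is omitted to keep the dependencies to SciPy and scikit-learn, and `class_weight="balanced"` is used as a stand-in for it. The "vote" is assumed here to mean picking the pattern whose base classifier returns the highest probability.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

PATTERNS = ["NSIP", "OP", "NSIP+OP", "UIP", "DAD"]

# Synthetic stand-in for the radiomic feature table (NOT the study data);
# class proportions loosely mimic the 105/90/66/35/19 split.
X, y = make_classification(
    n_samples=315, n_features=60, n_informative=12, n_classes=5,
    n_clusters_per_class=1, weights=[0.33, 0.29, 0.21, 0.11, 0.06],
    random_state=0,
)

# Stratified 4:1 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)


def select_features(X, target, p_threshold=0.05, alpha=0.01):
    """Independent-samples t test filter, then LASSO; returns column indices."""
    _, p = ttest_ind(X[target == 1], X[target == 0], axis=0)
    keep = np.flatnonzero(p < p_threshold)
    if keep.size == 0:  # fall back to the ten smallest p-values
        keep = np.argsort(p)[:10]
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X[:, keep], target)
    nonzero = keep[lasso.coef_ != 0]
    return nonzero if nonzero.size else keep


def fit_base_classifiers(X, y, n_classes):
    """One binary (one-vs-rest) random forest per imaging pattern."""
    models = []
    for k in range(n_classes):
        target = (y == k).astype(int)
        cols = select_features(X, target)
        # class_weight="balanced" stands in for SMOTETomek resampling here.
        rf = RandomForestClassifier(
            n_estimators=200, class_weight="balanced", random_state=0
        )
        rf.fit(X[:, cols], target)
        models.append((cols, rf))
    return models


def predict_scores(models, X):
    """Positive-class probability from each base classifier, one column each."""
    return np.column_stack(
        [rf.predict_proba(X[:, cols])[:, 1] for cols, rf in models]
    )


models = fit_base_classifiers(X_train, y_train, len(PATTERNS))
scores = predict_scores(models, X_test)
y_pred = scores.argmax(axis=1)  # assumed voting rule: highest probability wins

# Per-pattern one-vs-rest AUC and overall accuracy on the held-out set.
aucs = {
    name: roc_auc_score(y_test == k, scores[:, k])
    for k, name in enumerate(PATTERNS)
}
accuracy = float((y_pred == y_test).mean())
```

With real data, `X` would hold the Pyradiomics features (plus sex and age for a Radiomics + variant), and any resampling would be applied to the training set only, so that the test set keeps its original class distribution.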