Development and Testing of a Machine Learning Model Using 18 F-Fluorodeoxyglucose PET/CT-Derived Metabolic Parameters to Classify Human Papillomavirus Status in Oropharyngeal Squamous Carcinoma
- Author:
Changsoo WOO
1
;
Kwan Hyeong JO
;
Beomseok SOHN
;
Kisung PARK
;
Hojin CHO
;
Won Jun KANG
;
Jinna KIM
;
Seung-Koo LEE
Author Information
- Publication Type:Original Article
- From:Korean Journal of Radiology 2023;24(1):51-61
- CountryRepublic of Korea
- Language:English
-
Abstract:
Objective:To develop and test a machine learning model for classifying human papillomavirus (HPV) status of patients with oropharyngeal squamous cell carcinoma (OPSCC) using 18 F-fluorodeoxyglucose ( 18 F-FDG) PET-derived parameters in derived parameters and an appropriate combination of machine learning methods in patients with OPSCC.
Materials and Methods:This retrospective study enrolled 126 patients (118 male; mean age, 60 years) with newly diagnosed, pathologically confirmed OPSCC, that underwent 18 F-FDG PET-computed tomography (CT) between January 2012 and February 2020. Patients were randomly assigned to training and internal validation sets in a 7:3 ratio. An external test set of 19 patients (16 male; mean age, 65.3 years) was recruited sequentially from two other tertiary hospitals. Model 1 used only PET parameters, Model 2 used only clinical features, and Model 3 used both PET and clinical parameters. Multiple feature transforms, feature selection, oversampling, and training models are all investigated. The external test set was used to test the three models that performed best in the internal validation set. The values for area under the receiver operating characteristic curve (AUC) were compared between models.
Results:In the external test set, ExtraTrees-based Model 3, which uses two PET-derived parameters and three clinical features, with a combination of MinMaxScaler, mutual information selection, and adaptive synthetic sampling approach, showed the best performance (AUC = 0.78; 95% confidence interval, 0.46–1). Model 3 outperformed Model 1 using PET parameters alone (AUC = 0.48, p = 0.047) and Model 2 using clinical parameters alone (AUC = 0.52, p = 0.142) in predicting HPV status.
Conclusion:Using oversampling and mutual information selection, an ExtraTree-based HPV status classifier was developed by combining metabolic parameters derived from 18 F-FDG PET/CT and clinical parameters in OPSCC, which exhibited higher performance than the models using either PET or clinical parameters alone.