Plasma-metabolite-based machine learning is a promising diagnostic approach for esophageal squamous cell carcinoma investigation
- Author:
Chen ZHONGJIAN 1; Huang XIANCONG; Gao YUN; Zeng SU; Mao WEIMIN
Author Information
1. Laboratory of Pharmaceutical Analysis and Drug Metabolism, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China; The Cancer Research Institute, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, 310022, China
- Keywords:
Diagnostic;
Esophageal squamous cell carcinoma (ESCC);
Metabolomics;
Machine learning;
Prognostic
- From:
Journal of Pharmaceutical Analysis
2021;11(4):505-514
- Country:China
- Language:Chinese
- Abstract:
The aim of this study was to develop a diagnostic strategy for esophageal squamous cell carcinoma (ESCC) that combines plasma metabolomics with machine learning algorithms. Plasma-based untargeted metabolomics analysis was performed on samples from 88 ESCC patients and 52 healthy controls. The dataset was split into a training set and a test set. After differential metabolites were identified in the training set, single-metabolite-based receiver operating characteristic (ROC) curves and multiple-metabolite-based machine learning models were used to distinguish ESCC patients from healthy controls. Kaplan-Meier survival analysis and Cox proportional hazards regression analysis were performed to investigate the prognostic significance of the plasma metabolites. Finally, twelve differential plasma metabolites (six up-regulated and six down-regulated) were annotated. The predictive performance of the six most prevalent diagnostic metabolites through the diagnostic models in the test set was as follows: arachidonic acid (accuracy: 0.887), sebacic acid (accuracy: 0.867), indoxyl sulfate (accuracy: 0.850), phosphatidylcholine (PC) (14:0/0:0) (accuracy: 0.825), deoxycholic acid (accuracy: 0.773), and trimethylamine N-oxide (accuracy: 0.653). The prediction accuracies of the machine learning models in the test set were as follows: partial least squares (accuracy: 0.947), random forest (accuracy: 0.947), gradient boosting machine (accuracy: 0.960), and support vector machine (accuracy: 0.980). Additionally, survival analysis demonstrated that acetoacetic acid was an unfavorable prognostic factor (hazard ratio (HR): 1.752), while PC (14:0/0:0) (HR: 0.577) was a favorable prognostic factor for ESCC. This study devised an innovative strategy for ESCC diagnosis by combining plasma metabolomics with machine learning algorithms and revealed its potential to become a novel screening test for ESCC.
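
The single-metabolite evaluation described in the abstract (choose a cutoff on the training set, then report accuracy on the test set) can be illustrated with a minimal sketch. This is not the authors' code. The intensities below are synthetic, and the 70/30 stratified split, the Youden's J cutoff rule, and the assumption that the metabolite is up-regulated in ESCC are all illustrative choices that the abstract does not specify. Only the group sizes (88 patients, 52 controls) come from the abstract.

```python
import random


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as 0.5 (rank-based definition)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_threshold(scores, labels):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1.
    Assumption: higher intensity indicates ESCC (up-regulated metabolite)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t


def accuracy(scores, labels, threshold):
    """Fraction of samples classified correctly at the given cutoff."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def stratified_split(labels, train_fraction, rng):
    """Shuffle each class separately so both sets contain both classes."""
    train_idx, test_idx = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(train_fraction * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return train_idx, test_idx


# Synthetic stand-in for one metabolite's intensity (NOT study data).
rng = random.Random(0)
labels = [1] * 88 + [0] * 52  # 88 ESCC patients, 52 healthy controls
scores = [rng.gauss(1.5 if y == 1 else 0.0, 1.0) for y in labels]

# Hypothetical 70/30 split; the abstract does not state the ratio used.
train_idx, test_idx = stratified_split(labels, 0.7, rng)
train_scores = [scores[i] for i in train_idx]
train_labels = [labels[i] for i in train_idx]
test_scores = [scores[i] for i in test_idx]
test_labels = [labels[i] for i in test_idx]

# The cutoff is fitted on the training set only, then applied to the test set.
cutoff = best_threshold(train_scores, train_labels)
test_auc = roc_auc(test_scores, test_labels)
test_acc = accuracy(test_scores, test_labels, cutoff)
print(f"test AUC = {test_auc:.3f}, test accuracy = {test_acc:.3f}")
```

Fitting the cutoff on the training set alone keeps the test-set accuracy an honest estimate. For a down-regulated metabolite the comparison direction would be reversed.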