Plasma-metabolite-based machine learning is a promising diagnostic approach for esophageal squamous cell carcinoma investigation
- Author:
Chen ZHONGJIAN 1,2; Huang XIANCONG; Gao YUN; Zeng SU; Mao WEIMIN
Author Information
1. Laboratory of Pharmaceutical Analysis and Drug Metabolism, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
2. The Cancer Research Institute, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, 310022, China
- Keywords:
Diagnostic;
Esophageal squamous cell carcinoma (ESCC);
Metabolomics;
Machine learning;
Prognostic
- From:
Journal of Pharmaceutical Analysis
2021;11(4):505-514
- Country: China
- Language: Chinese
- Abstract:
The aim of this study was to develop a diagnostic strategy for esophageal squamous cell carcinoma (ESCC) that combines plasma metabolomics with machine learning algorithms. Plasma-based untargeted metabolomics analysis was performed on samples from 88 ESCC patients and 52 healthy controls. The dataset was split into a training set and a test set. After differential metabolites were identified in the training set, single-metabolite-based receiver operating characteristic (ROC) curves and multiple-metabolite-based machine learning models were used to distinguish ESCC patients from healthy controls. Kaplan-Meier survival analysis and Cox proportional hazards regression analysis were performed to investigate the prognostic significance of the plasma metabolites. Twelve differential plasma metabolites (six up-regulated and six down-regulated) were annotated. In the test set, the predictive performance of the six most prevalent diagnostic metabolites was as follows: arachidonic acid (accuracy: 0.887), sebacic acid (accuracy: 0.867), indoxyl sulfate (accuracy: 0.850), phosphatidylcholine (PC) (14:0/0:0) (accuracy: 0.825), deoxycholic acid (accuracy: 0.773), and trimethylamine N-oxide (accuracy: 0.653). The prediction accuracies of the machine learning models in the test set were as follows: partial least squares (accuracy: 0.947), random forest (accuracy: 0.947), gradient boosting machine (accuracy: 0.960), and support vector machine (accuracy: 0.980). Additionally, survival analysis showed that acetoacetic acid was an unfavorable prognostic factor (hazard ratio (HR): 1.752), while PC (14:0/0:0) (HR: 0.577) was a favorable prognostic factor for ESCC. This study devised an innovative strategy for ESCC diagnosis by combining plasma metabolomics with machine learning algorithms and revealed its potential to become a novel screening test for ESCC.
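The single-metabolite evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes one metabolite whose level is higher in patients (an up-regulated marker), picks a cutoff on a training set, and reports accuracy on a held-out test set. The function names (`roc_auc`, `accuracy`, `best_threshold`) and every number below are hypothetical toy values, not data from the study.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (patient, control) pairs in which the patient has the higher
    score. Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(scores: Sequence[float], labels: Sequence[int],
             threshold: float) -> float:
    """Accuracy of the rule: predict patient (1) when score >= threshold."""
    correct = sum((s >= threshold) == (y == 1)
                  for s, y in zip(scores, labels))
    return correct / len(labels)


def best_threshold(scores: Sequence[float],
                   labels: Sequence[int]) -> Tuple[float, float]:
    """Scan every observed score as a candidate cutoff and return the
    (threshold, accuracy) pair with the highest training accuracy.
    On ties, the lowest threshold is kept."""
    best_t, best_acc = float("inf"), -1.0
    for t in sorted(set(scores)):
        acc = accuracy(scores, labels, t)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical toy data: normalized metabolite levels, 1 = ESCC, 0 = control.
train_scores = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
train_labels = [0, 0, 1, 0, 1, 1]
test_scores = [0.3, 0.6, 0.45, 0.95]
test_labels = [0, 1, 0, 1]

train_auc = roc_auc(train_scores, train_labels)
cutoff, train_acc = best_threshold(train_scores, train_labels)
test_acc = accuracy(test_scores, test_labels, cutoff)
print(f"train AUC={train_auc:.3f}, cutoff={cutoff}, test accuracy={test_acc:.3f}")
```

For a down-regulated metabolite, the same functions would apply after negating the scores. Choosing the cutoff only on the training set and scoring it on the test set mirrors the train/test split the abstract describes. The multi-metabolite models (partial least squares, random forest, gradient boosting, support vector machine) would replace the single cutoff with a fitted classifier.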