Cancer staging diagnosis based on transcriptomics and variational autoencoder
10.16016/j.2097-0927.202412079
- VernacularTitle:基于转录组学和变分自编码器的癌症分期诊断研究
- Author: Jiarui LI (1); Li QIAN; Junjie SHEN; Honglin GUO; Maoyang QIN; Yazhou WU
Author Information
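1. Department of Military Health Statistics, Faculty of Military Preventive Medicine, Army Medical University (Third Military Medical University)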
- Keywords: cancer staging; transcriptomics; variational autoencoder; machine learning
- From: Journal of Army Medical University 2025;47(6):613-622
- Country: China
- Language:Chinese
- Abstract:
Objective: To analyze transcriptomic data from 10 cancer types in depth and extract features from them, in order to enable staging diagnosis of cancer samples.
Methods: Transcriptomic data for the 10 cancers with the highest incidence were collected from the UCSC Xena website, comprising 4 938 samples and 59 428 genes. Building on the variational autoencoder, we developed an incremental feature ranking and selection variational autoencoder (IFRSVAE), which is based on feature importance ranking and incorporates a masking algorithm and incremental feature selection (IFS). The performance of the IFRSVAE model was then evaluated in combination with random forest (RF), support vector machine (SVM), and eXtreme Gradient Boosting (XGBoost) classifiers, and compared with that of other methods.
Results: A total of 21 features were extracted for the subsequent classification. Compared with the conventional variational autoencoder, recursive feature elimination, and Lasso regression models, the IFRSVAE model performed better with all 3 classifiers (the highest AUC value and good results on the other indicators). IFRSVAE-RF performed best, with an AUC of 85.49% (95% CI: 83.24% to 87.74%). In addition, the Shapley additive explanations (SHAP) interpretability method clearly illustrated the contributions of the features in the model.
Conclusion: The proposed IFRSVAE is reasonably effective for feature extraction. The IFRSVAE-RF model performs relatively well in cancer staging diagnosis, offering a new reference direction for research on deep-learning-based diagnostic methods for cancer staging.
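The incremental feature selection (IFS) step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a feature ranking is already available (in the paper it would come from the VAE-based importance ranking with masking; here the list `ranking` is simply given), it uses synthetic data, and it swaps in a simple nearest-centroid classifier for RF, SVM, or XGBoost so that the example needs only the Python standard library. All function and variable names are hypothetical.

```python
import random
from statistics import mean


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Accuracy of a nearest-centroid classifier restricted to the columns in `cols`."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [mean(r[c] for r in rows) for c in cols]
    correct = 0
    for x, y in zip(X_test, y_test):
        # Predict the label whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda lab: sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[lab])),
        )
        correct += pred == y
    return correct / len(y_test)


def incremental_feature_selection(X_train, y_train, X_val, y_val, ranking):
    """Score the top-k prefix of `ranking` for every k and keep the best prefix."""
    scores = [
        nearest_centroid_accuracy(X_train, y_train, X_val, y_val, ranking[:k])
        for k in range(1, len(ranking) + 1)
    ]
    best_k = scores.index(max(scores)) + 1  # smallest k that reaches the best score
    return ranking[:best_k], scores


def make_samples(n, rng):
    """Synthetic two-class data: columns 0 and 1 carry signal, columns 2-4 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 2.0 + rng.gauss(0, 1), label * 1.5 + rng.gauss(0, 1)]
        row += [rng.gauss(0, 1) for _ in range(3)]
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_samples(200, rng)
X_val, y_val = make_samples(100, rng)
ranking = [0, 1, 2, 3, 4]  # assumed importance order, most important first

selected, scores = incremental_feature_selection(X_train, y_train, X_val, y_val, ranking)
print("selected features:", selected)
print("validation accuracy per prefix size:", scores)
```

Choosing the smallest prefix that reaches the best validation score keeps the selected feature set compact, which is in the spirit of the 21-feature result reported above. The real study would score each prefix with its own classifiers and metrics, such as AUC.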