Cancer staging diagnosis based on transcriptomics and variational autoencoder
Jiarui LI ; Li QIAN ; Junjie SHEN ; Honglin GUO ; Maoyang QIN ; Yazhou WU
Journal of Army Medical University 2025;47(6):613-622
Objective To analyze and extract features from the transcriptomics data of 10 types of cancer in order to achieve staging diagnosis of cancer samples. Methods Transcriptomics data for the 10 cancers with the highest incidence were collected from the UCSC Xena website, comprising 4,938 samples and 59,428 genes. Building on the variational autoencoder, we developed an incremental feature ranking and selection variational autoencoder (IFRSVAE) that ranks features by importance and incorporates a masking algorithm and incremental feature selection (IFS). The performance of the IFRSVAE model was then evaluated in combination with random forest (RF), support vector machine (SVM), and eXtreme Gradient Boosting (XGBoost) classifiers, and compared with other methods. Results A total of 21 features were extracted for the subsequent classification. Compared with the conventional variational autoencoder, recursive feature elimination, and Lasso regression models, the IFRSVAE model achieved better performance across all 3 classifiers (the highest AUC value and good performance on the other indicators). Notably, IFRSVAE-RF performed best, with an AUC of 85.49% (95% CI: 83.24% to 87.74%). Moreover, the Shapley additive explanations (SHAP) interpretability model clearly illustrated the contributions of the features in our model. Conclusion The proposed IFRSVAE shows a degree of effectiveness in feature extraction. The constructed IFRSVAE-RF model performs relatively well in the task of cancer staging diagnosis, providing a new reference for research on deep-learning-based diagnostic methods for cancer staging.
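
The abstract does not specify how IFRSVAE ranks features or applies masking, so the following is only a minimal sketch of the generic incremental feature selection (IFS) step, under stated assumptions: a feature ranking is taken as given, candidate subsets are the top-k ranked features for increasing k, and the subset with the best hold-out score is kept. The nearest-centroid classifier is a stand-in chosen to keep the example dependency-free (the study used RF, SVM, and XGBoost), and all function names and the toy data are hypothetical.

```python
def nearest_centroid_accuracy(train_X, train_y, test_X, test_y, cols):
    """Hold-out accuracy of a nearest-centroid classifier restricted to `cols`."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    correct = 0
    for row, label in zip(test_X, test_y):
        # Predict the class whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda lab: sum(
                (row[c] - centroids[lab][j]) ** 2 for j, c in enumerate(cols)
            ),
        )
        correct += pred == label
    return correct / len(test_y)


def incremental_feature_selection(ranked, train_X, train_y, test_X, test_y):
    """Evaluate top-k prefixes of `ranked` and keep the smallest best-scoring one."""
    best_k, best_score = 0, -1.0
    for k in range(1, len(ranked) + 1):
        score = nearest_centroid_accuracy(train_X, train_y, test_X, test_y, ranked[:k])
        if score > best_score:  # strict ">" prefers the smaller subset on ties
            best_k, best_score = k, score
    return ranked[:best_k], best_score


# Hypothetical toy data: column 0 separates the two classes, columns 1 and 2 are noise.
train_X = [[0.0, 5.0, 1.0], [0.2, 1.0, 9.0], [1.0, 5.2, 1.1], [1.2, 0.9, 9.1]]
train_y = [0, 0, 1, 1]
test_X = [[0.1, 9.0, 9.0], [1.1, 0.0, 0.0]]
test_y = [0, 1]
ranked = [0, 2, 1]  # assumed importance ranking, most important first

selected, score = incremental_feature_selection(ranked, train_X, train_y, test_X, test_y)
print(selected, score)
```

In practice the score would come from cross-validation rather than a single hold-out split, and the ranking would come from the model's own feature-importance estimates.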
