Key gene screening and prediction model construction of gastric cancer based on machine learning
10.3969/j.issn.1005-202X.2024.01.017
- VernacularTitle:基于机器学习的胃癌关键基因筛选及预测模型构建
- Author: Zepeng WANG 1; Kunpeng LI; Yu ZHOU; Sihai LI
- Author Information: 1. School of Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730100, Gansu, China
- Keywords: gastric cancer; gene screening; key gene; bioinformatics; machine learning
- From: Chinese Journal of Medical Physics 2024;41(1):115-124
- Country:China
- Language:Chinese
- Abstract:
Objective: To verify the genetic characteristics associated with gastric cancer, to propose a hybrid feature selection method for identifying target genes, to analyze the significance of those genes, and to establish a new diagnostic prediction model.
Methods: Analysis of variance was first performed on the original gastric cancer data as the bioinformatics step. Machine learning methods such as random forest, support vector machine recursive feature elimination, and the LASSO algorithm were then used to screen gastric cancer-associated genes, and the intersection of their results was taken as the key gene set. The key genes were identified and verified through enrichment analysis. Diagnostic prediction models were then constructed from the key genes using 8 machine learning classification algorithms, including multi-layer perceptron (MLP), logistic regression, and decision tree.
Results: The key genes selected by the hybrid feature selection method were closely related to tumorigenesis and tumor development. Eight key genes (TXNDC5, BMP8A, ONECUT2, COL10A1, JCHAIN, INHBA, LCTL, and TRIM59) were identified as potential markers with good diagnostic efficacy in gastric cancer. The ROC curves and accuracy results showed that, among the 8 classification models, MLP was the best gastric cancer prediction model, with an accuracy of 97.77%, which was 3.83% higher than that of the XGBoost gastric cancer prediction model.
Conclusion: The study identifies 8 key genes for the diagnosis and prevention of gastric cancer and establishes the optimal prognosis model.
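The hybrid feature selection step described in the abstract (three selectors, then the intersection of their results) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' pipeline: the function name `hybrid_select`, the `top_k` and `alpha` values, the synthetic `GENE_i` names, and the use of plain Lasso regression on 0/1 labels are all assumptions, since the abstract does not give the actual settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def hybrid_select(X, y, gene_names, top_k=10, alpha=0.02, seed=0):
    """Hypothetical helper: intersect the genes picked by RF, SVM-RFE and LASSO."""
    X = StandardScaler().fit_transform(X)  # put all genes on a common scale
    names = np.asarray(gene_names)

    # 1) Random forest: keep the top_k genes by impurity-based importance.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    top_idx = np.argsort(rf.feature_importances_)[::-1][:top_k]
    rf_genes = {str(g) for g in names[top_idx]}

    # 2) SVM-RFE: drop the weakest gene one at a time with a linear SVM.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=top_k, step=1).fit(X, y)
    svm_genes = {str(g) for g in names[rfe.support_]}

    # 3) LASSO: keep genes whose coefficient is not shrunk to zero.
    #    (Assumption: ordinary Lasso regression on the 0/1 class labels.)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    lasso_genes = {str(g) for g in names[np.flatnonzero(lasso.coef_)]}

    # Key gene set = genes selected by all three methods.
    key = rf_genes & svm_genes & lasso_genes
    return key, (rf_genes, svm_genes, lasso_genes)


# Synthetic stand-in for an expression matrix (samples x genes).
X, y = make_classification(n_samples=120, n_features=40, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)
genes = [f"GENE_{i}" for i in range(X.shape[1])]
key_genes, per_method = hybrid_select(X, y, genes, top_k=10)
print(sorted(key_genes))
```

Taking the intersection is the conservative choice: a gene survives only if all three selectors, each with different biases, agree on it. That tends to give a small, stable set at the cost of possibly discarding genes that only one method ranks highly.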