Application of multiple support vector machine recursive feature elimination model in cancer feature gene selection
10.3760/cma.j.issn.1673-4181.2019.01.006
- VernacularTitle:多重支持向量机递归特征消除模型在癌症特征基因筛选中的应用
- Author:
Wenbin XU 1; Hong XIA; Weiying ZHENG; Lin HUA
Author Information
1. School of Biomedical Engineering, Capital Medical University
- Keywords:
Gene expression profile;
Recursive feature elimination;
Support vector machine;
Feature gene
- From:
International Journal of Biomedical Engineering
2019;42(1):33-38
- Country:China
- Language:Chinese
- Abstract:
Objective To analyze cancer gene expression profile data using the multiple support vector machine recursive feature elimination (MSVM-RFE) algorithm and to calculate gene ranking scores in order to obtain the optimal feature gene subset. Methods Gene expression profiles of bladder cancer, breast cancer, colon cancer and lung cancer were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes were identified by differential expression analysis. These genes were ranked by the MSVM-RFE algorithm, and the average test error of each gene subset was calculated. The optimal gene subset for each of the four cancers was then selected according to the minimum average test error. Linear SVM classifiers were constructed on the datasets of the four cancers before and after feature gene screening, and the classification efficiency of the optimal feature gene subsets was verified. Results Using the optimal feature gene subset obtained by the MSVM-RFE algorithm, the classification accuracy improved from (96.77±1.28)% to (99.85±0.46)% for the bladder cancer data, from (83.77±4.93)% to (88.30±3.85)% for the breast cancer data, and from (72.69±2.41)% to (90.21±3.31)% for the lung cancer data. In addition, the optimal feature gene subset kept the classification accuracy of the colon cancer classifier at a high level (>99.5%). Conclusions Feature gene extraction based on the MSVM-RFE algorithm can improve the efficiency of cancer classification.
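
The ranking and subset-selection steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no code, parameters or software, so everything here is an assumption, including the use of scikit-learn, the synthetic data from `make_classification` (standing in for the GEO expression profiles), the hypothetical helper names `msvm_rfe_ranking` and `best_subset`, the number of resamples, and the ranking score (mean of the squared normalized SVM weights divided by their standard deviation across resamples, one common formulation of MSVM-RFE). For simplicity, genes are ranked on the full dataset before cross-validation, which a careful study would avoid.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.svm import SVC


def msvm_rfe_ranking(X, y, n_resamples=5, step=1, seed=0):
    """Rank features from most to least important (hypothetical helper)."""
    remaining = list(range(X.shape[1]))
    eliminated = []  # filled from least important to most important
    splitter = StratifiedShuffleSplit(
        n_splits=n_resamples, train_size=0.8, random_state=seed
    )
    while remaining:
        sq_weights = []
        # Train one linear SVM per resample of the data.
        for train_idx, _ in splitter.split(X, y):
            clf = SVC(kernel="linear", C=1.0)
            clf.fit(X[train_idx][:, remaining], y[train_idx])
            w = clf.coef_.ravel()
            w = w / (np.linalg.norm(w) + 1e-12)  # normalize the weight vector
            sq_weights.append(w ** 2)
        sq_weights = np.asarray(sq_weights)
        # Assumed ranking score: mean squared weight over its spread.
        score = sq_weights.mean(axis=0) / (sq_weights.std(axis=0) + 1e-12)
        # Drop the lowest-scoring feature(s) and repeat on the rest.
        drop = np.argsort(score)[: min(step, len(remaining))]
        eliminated.extend(remaining[p] for p in drop)
        drop_set = set(drop.tolist())
        remaining = [f for p, f in enumerate(remaining) if p not in drop_set]
    return eliminated[::-1]


def best_subset(X, y, ranking, sizes, cv=5):
    """Pick the top-k subset with the lowest mean cross-validated error."""
    errors = {}
    for k in sizes:
        acc = cross_val_score(
            SVC(kernel="linear", C=1.0), X[:, ranking[:k]], y, cv=cv
        )
        errors[k] = 1.0 - acc.mean()
    best_k = min(errors, key=errors.get)
    return ranking[:best_k], errors


# Synthetic stand-in for a matrix of differentially expressed genes.
X, y = make_classification(
    n_samples=80, n_features=20, n_informative=4, n_redundant=0, random_state=0
)
ranking = msvm_rfe_ranking(X, y)
subset, errors = best_subset(X, y, ranking, sizes=[2, 4, 8, 16, 20])
print("selected features:", subset)
print("mean test error per subset size:", errors)
```

On real data, `X` would hold samples in rows and differentially expressed genes in columns, and the subset returned by `best_subset` would play the role of the optimal feature gene subset described above.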