Constructing non-small cell lung cancer survival prediction model based on Borderline-SMOTE and PFS
- DOI:10.3760/cma.j.issn.1673-4181.2019.04.011
- VernacularTitle:基于Borderline-SMOTE和PFS构建非小细胞肺癌生存预测模型
- Author:Yang ZHAO (1); Xiaojie WANG; Lei MA; Dangguo SHAO; Yan XIANG; Xin XIONG; Li ZHANG
- Author Information:
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
- Keywords:Non-small cell lung cancer; Imbalance; Feature selection; Survival prediction
- From:International Journal of Biomedical Engineering 2019;42(4):336-341
- Country:China
- Language:Chinese
- Abstract:
Objective: To predict the 5-year survival of patients with non-small cell lung cancer (NSCLC) using machine learning, and to improve prediction efficiency and accuracy. Methods: Experiments were performed on NSCLC data from the SEER database. To address the class imbalance in the patient data, the Borderline-SMOTE method was used for oversampling. The perturbation-based feature selection (PFS) method and the decision tree (DT) algorithm were then used to screen features and construct a postoperative survival prediction model. Results: The patient data were balanced, and seven prognostic variables were screened, including primary site, stage group, surgery of primary site, international classification of diseases, race, and grade. Compared with the LASSO, Tree-based, PFS-SVM, and PFS-kNN models, the PFS-DT model gave the best predictive performance. Conclusions: The PFS-DT survival prediction model can effectively improve the accuracy of postoperative survival prediction in patients with NSCLC and can serve as a reference for doctors in planning treatment and improving prognosis.
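The abstract names Borderline-SMOTE as the balancing step but gives no implementation details. The following is a minimal sketch of the idea, assuming the Borderline-SMOTE1 variant: a minority sample is treated as "in danger" when at least half, but not all, of its m nearest neighbours belong to the majority class, and synthetic samples are interpolated only from such borderline points. The function names, the parameters `m`, `k`, `n_new`, and the toy two-dimensional data are illustrative assumptions; this is not the authors' code and does not use SEER data.

```python
import math
import random


def nearest(point, candidates, k):
    """Return indices of the k candidates closest to point (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def borderline_smote(X, y, minority=1, m=5, k=3, n_new=1, seed=0):
    """Borderline-SMOTE1 sketch: oversample only minority points near the border.

    All parameter names and defaults are assumptions made for illustration.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    synthetic = []
    for p in minority_pts:
        # m nearest neighbours among all samples, excluding p itself
        others = [(x, label) for x, label in zip(X, y) if x is not p]
        idx = nearest(p, [x for x, _ in others], m)
        n_major = sum(1 for i in idx if others[i][1] != minority)
        # "Danger" set: at least half, but not all, neighbours are majority
        if not (m / 2 <= n_major < m):
            continue
        peers = [q for q in minority_pts if q is not p]
        if not peers:
            continue
        for _ in range(n_new):
            # pick one of the k nearest minority neighbours and interpolate
            q = peers[rng.choice(nearest(p, peers, k))]
            gap = rng.random()
            synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return X + synthetic, y + [minority] * len(synthetic)


# Toy data (hypothetical): 8 majority samples (label 0), 4 minority (label 1)
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 0], [2, 1], [0, 2],
     [1.2, 1.2], [1.5, 1.5], [3, 3], [3.2, 3.0]]
y = [0] * 8 + [1] * 4

X_res, y_res = borderline_smote(X, y, minority=1, m=3, k=2, n_new=1)
print(len(X), "->", len(X_res), "samples;", y_res.count(1), "minority")
```

In this toy set only the two minority points lying close to the majority cluster are oversampled; the two points far from the border are left alone, which is the property that distinguishes Borderline-SMOTE from plain SMOTE. The PFS and decision tree steps are not sketched here because the abstract does not describe them in enough detail.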