Applying decision trees to establish risk rating model of breast cancer incidence based on non-genetic factors among Southwest China females
10.3760/cma.j.issn.0253-3766.2018.11.015
- VernacularTitle: 运用决策树建立中国西南地区女性乳腺癌非遗传因素风险等级模型
- Author: Qin LI 1; Sha DIAO 2; Hui LI 3; Hua HE 4; Jiayuan LI 2
- Author Information:
1. West China School of Public Health, Sichuan University, Chengdu 610041, China (Current address: Hospital Infection Management Section, Sichuan Maternal and Child Health Care Hospital, Sichuan Women and Children's Hospital, Chengdu 610045, China)
2. West China School of Public Health, Sichuan University, Chengdu 610041, China
3. Department of Epidemiology and Health Statistics, Southwest Medical University, Luzhou 646000, China
4. Medical Department, Sichuan Maternal and Child Health Care Hospital, Sichuan Women and Children's Hospital, Chengdu 610045, China
- Publication Type:Journal Article
- Keywords:
Breast neoplasms;
Decision trees;
Models, statistical
- From:
Chinese Journal of Oncology
2018;40(11):872-877
- Country:China
- Language:Chinese
- Abstract:
Objective:To estimate the probability of breast cancer incidence and to establish a risk rating model under different combinations of non-genetic factors among women in Southwest China, using decision trees.
Methods:From 2014 to 2015, a total of 783 patients with pathologically diagnosed primary breast cancer were consecutively enrolled from West China Hospital of Sichuan University, Sichuan Cancer Hospital and Sichuan Province People's Hospital. A total of 3,879 controls (excluding 36 samples with missing data) were randomly selected and matched by area of residence and age. The classification and regression tree (CART) algorithm was applied to construct a breast cancer risk rating model based on non-genetic factors. Five test sets were randomly selected for model validation.
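The abstract does not describe the CART configuration, so the following is only a minimal sketch of the core step that CART repeats at each node: choosing the binary split that minimizes the weighted Gini impurity. The function names (`gini`, `best_split`) and the toy rows (age, postmenopausal flag) are hypothetical illustrations, not the study's data or implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Find the binary split with the lowest weighted Gini impurity.

    rows   : list of numeric feature tuples
    labels : class label per row (1 = case, 0 = control)
    A row goes to the left child when its feature value <= threshold.
    Returns (feature_index, threshold, weighted_gini); the first two are
    None when no split improves on the unsplit node.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send rows to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy data: (age in years, postmenopausal flag)
toy_rows = [(35, 0), (42, 0), (48, 0), (55, 1), (61, 1), (67, 1)]
toy_labels = [0, 0, 0, 1, 1, 1]
print(best_split(toy_rows, toy_labels))
```

A full CART model applies this search recursively to each child node and then prunes the tree; the leaf-level case proportions are what a risk rating would be read from.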
Results:BI-RADS class, menopausal status, age, history of benign breast disease, age at menarche, age at first delivery and number of live births were identified as risk factors and included in the risk rating model of breast cancer incidence. Among these factors, BI-RADS class, menopausal status and age were the most important. The resulting risk rating model was verified on 5 test sets, and the average sensitivity, positive predictive value and accuracy were 95.60%, 92.26% and 97.93%, respectively.
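For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, positive predictive value and accuracy are computed from confusion-matrix counts and then averaged over several test sets. The counts are invented for illustration and the function names are hypothetical; the abstract does not report the per-set confusion matrices.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (sensitivity, positive predictive value, accuracy).

    tp, fp, fn, tn are the true-positive, false-positive,
    false-negative and true-negative counts of one test set.
    """
    sensitivity = tp / (tp + fn)          # share of true cases detected
    ppv = tp / (tp + fp)                  # share of predicted cases that are true
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, ppv, accuracy


def average_metrics(test_sets):
    """Average each metric over a list of (tp, fp, fn, tn) tuples."""
    per_set = [classification_metrics(*counts) for counts in test_sets]
    return tuple(sum(values) / len(per_set) for values in zip(*per_set))


# Hypothetical counts for two test sets (not taken from the study)
example_sets = [(90, 10, 10, 890), (80, 20, 20, 880)]
print(average_metrics(example_sets))
```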
Conclusions:The breast cancer risk rating model constructed with decision trees was valid and reliable. The model could serve as a basic tool for breast cancer risk assessment among women in Southwest China.