Risk model of breast cancer prognosis based on the expression profile of long non-coding RNA
10.3760/cma.j.cn121361-20190112-00010
- VernacularTitle:基于长链非编码RNA的表达谱特征构建乳腺癌患者预后的风险模型
- Author: Jinsong WANG 1; Chunxiao LI; Ting WANG; Jingyao ZHANG; Yantong ZHOU; Fangzhou SUN; Mengjiao CHANG; Fei MA; Haijuan WANG; Haili QIAN
Author Information
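1. State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021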
- From: Clinical Medicine of China 2020;36(3):217-222
- Country: China
- Language:Chinese
- Abstract:
Objective: To construct a model that predicts the prognosis of breast cancer patients from long non-coding RNA (LncRNA) expression characteristics.
Methods: LncRNA expression profiles and clinical characteristics of 1 081 breast cancer patients in The Cancer Genome Atlas (TCGA) database were analyzed. Differential expression analysis and univariate analysis were performed on transcriptome sequencing data from 112 paired breast cancer and normal breast tissues in TCGA to screen for differentially expressed LncRNAs (DELncRNAs) that correlated significantly with the prognosis of breast cancer (BRCA); to reduce batch effects, the sequencing data were normalized using the DESeq function. The 1 081 patients were randomly divided into a training set (n=541) and a validation set (n=540). In the training set, Cox proportional hazards regression on the DELncRNAs was used to establish a multi-LncRNA prognostic model, followed by a proportional hazards assumption test (PH assumption test). Patients were divided into high-risk and low-risk groups according to the calculated risk score. The Kaplan-Meier (K-M) method was used for survival analysis, and the data of the 540 validation-set patients were used to validate the model. The prognostic value of the model was also evaluated in TCGA patients with lung squamous cell carcinoma and hepatocellular carcinoma. Gene Set Enrichment Analysis (GSEA) was used to explore the mechanism by which the LncRNAs affect patient survival.
Results: Transcriptome sequencing analysis screened out 2 815 differentially expressed genes, 91 of which were significantly related to the prognosis of breast cancer patients (P<0.05). Based on Cox regression analysis of the expression data of these 91 DELncRNAs in the 541 breast cancer patients of the training set, a Cox proportional hazards regression model was constructed from 5 LncRNAs: AC004551.1, MTOR-AS1, KCNAB1-AS2, FAM230G and LINC01283 (training set AUC=0.746, validation set AUC=0.650); the PH assumption test gave P=0.388. K-M survival analysis showed that survival in the high-risk group was significantly shorter than in the low-risk group (median survival time: 7.049 vs 12.21 years, HR 0.367, 95% CI 0.228-0.597, P<0.001), a result repeated in a second comparison (median survival time: 7.57 vs 10.85 years, HR 0.412, 95% CI 0.214-0.793, P<0.001). Similar prediction results were obtained in other TCGA cancer types: lung squamous cell carcinoma (HR 0.604, 95% CI 0.383-0.951, P=0.007) and hepatocellular carcinoma (HR 0.551, 95% CI 0.307-0.987, P=0.011). GSEA results suggested that the expression patterns of the five LncRNAs were related to cell cycle regulation in tumor cells.
Conclusion: The prognostic model constructed from the expression profile of AC004551.1, MTOR-AS1, KCNAB1-AS2, FAM230G and LINC01283 can be used to predict the prognosis of breast cancer patients and may help to guide clinical treatment.
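The risk-score and survival steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The coefficient values are hypothetical placeholders (the abstract does not report them), the median cutoff is an assumption (the abstract only says patients were grouped by risk score), and the function names `risk_score`, `split_by_median` and `kaplan_meier` are invented for illustration.

```python
from statistics import median

# Hypothetical Cox coefficients: placeholders only, NOT the published values.
COEFFICIENTS = {
    "AC004551.1": 0.30,
    "MTOR-AS1": -0.20,
    "KCNAB1-AS2": 0.25,
    "FAM230G": 0.15,
    "LINC01283": -0.10,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Cox linear predictor: sum of coefficient * expression over the LncRNAs."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption; the abstract does not state the cutoff.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(t, S(t))] at each distinct event time.

    events[i] is True for an observed death and False for censoring.
    """
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Toy cohort: every LncRNA gets the same made-up expression value per patient.
toy_levels = {"p1": 1.0, "p2": 0.0, "p3": 2.0, "p4": -1.0}
scores = {
    pid: risk_score({g: level for g in COEFFICIENTS})
    for pid, level in toy_levels.items()
}
groups = split_by_median(scores)
```

In practice the coefficients would come from fitting a Cox model on the training set, and the K-M curves of the two groups would be compared with a log-rank test.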