Screening of characteristic genes in early-onset pre-eclampsia and analysis of their association with immune cell infiltration based on bioinformatics analysis and machine-learning algorithms
10.3760/cma.j.cn113903-20230730-00060
- VernacularTitle:基于生物信息学分析及机器学习方法筛选早发型子痫前期的特征基因及相关免疫细胞浸润分析
- Author:
Zitong WU (1); Yuanyuan ZHENG; Xin DING
- Author Information:
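1. Department of Obstetrics, Beijing Obstetrics and Gynecology Hospital, Capital Medical University (Beijing Maternal and Child Health Care Hospital), Beijing 100026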
- Keywords:
Pre-eclampsia;
Computational biology;
Machine learning;
Gene expression;
Cellular microenvironment;
Macrophages
- From:
Chinese Journal of Perinatal Medicine
2024;27(1):51-61
- Country: China
- Language: Chinese
- Abstract:
Objective: To screen the characteristic genes of early-onset pre-eclampsia (EOSP) and to analyze their association with immune cell infiltration using bioinformatics analysis and machine-learning methods.

Methods: Placental-tissue mRNA expression data from women with EOSP and women with normal pregnancy were retrieved from the Gene Expression Omnibus (GEO) database using the search term "early-onset pre-eclampsia". The R language was used for background correction, normalization, summarization, and probe quality control. Annotation packages were downloaded for probe ID conversion, and the expression matrices were extracted. Differentially expressed genes (DEGs) between EOSP and normal pregnancy in the merged data were analyzed after correcting for batch effects using the limma package. Characteristic genes were identified with the support vector machine-recursive feature elimination (SVM-RFE) method and the LASSO regression model. The area under the receiver operating characteristic curve (AUC) was calculated to assess the diagnostic efficiency of the characteristic genes. For verification, placental tissues were retrospectively collected from 15 patients with EOSP and 15 women with normal pregnancy who delivered at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, from January 1, 2022, to February 28, 2023. The expression of the characteristic genes was verified using quantitative real-time polymerase chain reaction (qRT-PCR) and Western blot, and was further validated in the validation dataset. Finally, the CIBERSORT algorithm was used to estimate the relative proportions of infiltrating immune cells in EOSP. A t-test was used for differential analysis.

Results: Three gene datasets were downloaded: GSE44711 (eight cases each of EOSP and normal pregnancy), GSE74341 (seven cases of EOSP and five cases of normal pregnancy), and GSE190639 (13 cases each of EOSP and normal pregnancy). A total of 29 DEGs were identified after combining the GSE44711 and GSE74341 datasets, comprising 27 upregulated and two downregulated genes. Gene ontology enrichment analysis showed that these genes are mainly involved in the secretion of gonadotropins, female pregnancy, regulation of endocrine processes, secretion of endocrine hormones, and negative regulation of hormone secretion. Eight characteristic genes (EBI3, HTRA4, TREML2, TREM1, NTRK2, ANKRD37, CST6, and ARMS2) were screened using the LASSO regression algorithm combined with the SVM-RFE algorithm, and the expression differences of these genes were verified as statistically significant by qRT-PCR and Western blot (all P<0.05, except for CST6). Logistic regression showed that the AUCs (95% CI) of TREML2, ANKRD37, NTRK2, TREM1, HTRA4, EBI3, and ARMS2 were 0.979 (0.918-1.000), 0.969 (0.897-1.000), 0.969 (0.892-1.000), 0.979 (0.918-1.000), 0.990 (0.954-1.000), 0.990 (0.954-1.000), and 0.903 (0.764-1.000), respectively. Immune cell infiltration analysis indicated that the infiltration ratio of M2 macrophages in placental tissue from EOSP was significantly lower than that in normal pregnancy (0.167±0.074 vs. 0.462±0.091, P=0.002), whereas the infiltration ratios of monocytes and eosinophils were significantly higher (0.201±0.004 vs. 0.085±0.006; 0.031±0.001 vs. 0.001±0.000, both P<0.05). Correlation analysis between the characteristic genes and infiltrating immune cells found that the seven characteristic genes were significantly correlated with the immune cells (all P<0.05).

Conclusion: Seven characteristic genes that are critical for the prediction and early diagnosis of EOSP were screened in this study using bioinformatics analysis and machine-learning algorithms, providing new research targets and a basis for the future prevention and treatment of pre-eclampsia.
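
The AUC used above to judge diagnostic efficiency can be illustrated with a minimal sketch. It is not the authors' code: the abstract reports AUCs obtained through logistic regression in R, whereas the function below (a hypothetical helper named `roc_auc`) computes the equivalent rank-based quantity directly, namely the probability that a randomly chosen case has a higher value than a randomly chosen control. For a single gene whose expression is higher in cases, a logistic model's predicted probability is a monotone function of expression, so both approaches rank samples identically. The expression values are invented for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: P(score of a case > score of a control), ties count 0.5.

    `labels` uses 1 for cases (e.g. EOSP) and 0 for controls (normal pregnancy).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0   # case ranked above control
            elif c == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(cases) * len(controls))


# Hypothetical expression values for one gene (made-up numbers, NOT study data).
expression = [2.1, 3.4, 3.9, 4.2, 1.0, 1.8, 2.5, 3.0]
is_case = [1, 1, 1, 1, 0, 0, 0, 0]

print(roc_auc(expression, is_case))  # 14 of 16 case-control pairs are ordered correctly
```

An AUC of 1.0 means every case ranks above every control, and 0.5 means the gene separates the groups no better than chance. With the small sample sizes typical of such datasets, the confidence interval matters as much as the point estimate, which is why the abstract reports both.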