Construction of a prediction model for depression risk in perimenopausal women
- DOI:10.3760/cma.j.cn371468-20240722-00339
- VernacularTitle:围绝经期女性抑郁风险预测模型的构建
- Author:Dengqin WANG¹; Peibo SONG; Wanbin LI; Jingrui XIE
Author Information
1. Medical Comprehensive Training Center, Jining Medical University, Jining 272067
- Publication Type:Journal Article
- Keywords:Perimenopausal; Machine learning; Predictive model; Depression; Random forest model
- From:Chinese Journal of Behavioral Medicine and Brain Science 2025;34(2):151-157
- Country:China
- Language:Chinese
- Abstract:
Objective: To establish a machine learning-based risk prediction model for depressive symptoms in perimenopausal women and to identify associated risk factors.
Methods: A total of 1 105 women aged 45 to 55 years were selected from the 2020 China Health and Retirement Longitudinal Study (CHARLS) dataset. Three machine learning algorithms, Random Forest, XGBoost, and Adaptive Boosting (AdaBoost), were used to construct prediction models for perimenopausal depressive symptoms. Descriptive statistics and between-group comparisons were performed in SPSS 24.0, and the risk prediction models were built in Python 3.10. Model performance was assessed using receiver operating characteristic (ROC) curves and calibration plots, and the optimal model was selected accordingly. The Shapley additive explanations (SHAP) algorithm was then used to analyze feature importance and the influence of each predictor on the outcome.
Results: Among the 1 105 perimenopausal women, 671 (60.7%) were classified into the non-depressive group and 434 (39.3%) into the depressive group. The Random Forest model showed the best overall predictive performance of the three models, with an area under the ROC curve (AUC) of 0.793 and a calibration error of 0.181. SHAP analysis showed that annual household income was the strongest risk factor in the Random Forest model, with a relative importance of 0.048, followed by cognitive function (0.047), self-rated health status (0.046), life satisfaction (0.043), and sleep duration (0.041).
Conclusions: The Random Forest-based model effectively predicts the risk of perimenopausal depressive symptoms. Annual household income, cognitive function, self-rated health, and life satisfaction are risk factors for depressive symptoms in perimenopausal women.
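The abstract reports two evaluation figures, an AUC and a calibration error, without stating how the calibration error was computed. As an illustration only, the sketch below computes AUC by pairwise ranking and a binned expected calibration error in plain Python. The function names, the choice of ten equal-width bins, and the toy labels and probabilities are assumptions made for this example; they are not the authors' code or the CHARLS data.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Size-weighted mean of |observed event rate - mean predicted probability| per bin.

    Assumption: equal-width probability bins; other definitions of
    calibration error exist and may give different values.
    """
    if len(y_true) == 0:
        raise ValueError("no observations")
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    total = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / total * abs(observed - predicted)
    return ece


if __name__ == "__main__":
    # Hypothetical labels (1 = depressive symptoms) and predicted probabilities.
    labels = [0, 0, 1, 1, 0, 1, 1, 0]
    probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]
    # 13 of the 16 positive-negative pairs are ranked correctly, so AUC is 0.8125.
    print("AUC:", roc_auc(labels, probs))
    print("ECE:", expected_calibration_error(labels, probs))
```

Under this definition, both metrics lie between 0 and 1; a higher AUC indicates better discrimination, while a lower calibration error indicates predicted probabilities closer to observed rates.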