Prediction of cumulative live birth rate in in vitro fertilization using multi-model machine learning algorithms
10.3760/cma.j.cn101441-20241213-00471
- VernacularTitle:基于多模型机器学习算法预测体外受精累积活产率的研究
- Author:
Peng XING¹;
Hui LIANG;
Ying CHEN;
Ting LIU;
Jiawei ZHAI;
Bo YUAN;
Yingjun TIAN
Author Information
1. Department of Reproductive Medicine, Baoding Maternal and Child Health Hospital, Baoding 071000, China
- Publication Type:Journal Article
- Keywords:
Fertilization in vitro;
Cumulative live birth rate;
Predictive model;
Extreme gradient boosting;
SHAP
- From:
Chinese Journal of Reproduction and Contraception
2025;45(4):358-364
- Country:China
- Language:Chinese
- Abstract:
Objective: To develop and validate machine learning models for predicting the cumulative live birth rate (CLBR) following in vitro fertilization (IVF) and to analyze key predictive features using SHAP values. Methods: This retrospective study included data from patients who underwent IVF-embryo transfer at the Department of Reproductive Medicine, Baoding Maternal and Child Health Hospital, between January 2017 and December 2022. Patients were categorized into two groups based on live birth outcome: the live birth group (n=1,036) and the non-live birth group (n=756). The dataset was randomly divided into a training set and a validation set at a ratio of 7:3. Five algorithms were used for model development: logistic regression, random forest, extreme gradient boosting (XGBoost), support vector machine, and neural network. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), F1 score, and calibration curves. Decision curve analysis (DCA) was employed to evaluate the clinical utility of the models. SHAP values were used to interpret feature importance in the XGBoost model and enhance its explainability. Results: The XGBoost model demonstrated the best performance in predicting CLBR, with an accuracy of 72.44%, an AUC of 0.775, and an F1 score of 0.654. Its accuracy and F1 score exceeded those of logistic regression (accuracy 70.02%, F1 score 0.585), random forest (accuracy 71.69%, F1 score 0.606), support vector machine (accuracy 70.20%, F1 score 0.607), and neural network (accuracy 68.72%, F1 score 0.560). The calibration curve of XGBoost closely followed the diagonal line, indicating that the predicted probabilities agreed well with the observed outcomes. DCA indicated that the XGBoost model provided higher net benefits across a wide range of clinical decision thresholds.
SHAP value analysis identified the number of previous IVF failures, antral follicle count, anti-Müllerian hormone level, percentage of normal sperm morphology, and sperm DNA fragmentation index as key predictors of CLBR. Conclusion: The XGBoost model exhibits excellent predictive performance and calibration for CLBR, with SHAP values providing important insights into feature importance. This model has the potential to support the development of personalized treatment strategies in clinical practice. However, its generalizability needs to be validated using external datasets to ensure its applicability to diverse populations.
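
The evaluation metrics named in the abstract (accuracy, AUC, F1 score, and the net benefit used in DCA) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which the abstract does not describe. The labels and predicted probabilities below are invented toy values, not study data, and all function names are hypothetical. Net benefit is assumed to follow the standard decision curve definition, TP/N - FP/N × pt/(1 - pt), where pt is the decision threshold.

```python
# Toy illustration only: labels and probabilities are made up, not study data.

def auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion(y_true, y_score, threshold):
    """Return (tp, fp, fn, tn) when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, y_score):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def accuracy(y_true, y_score, threshold=0.5):
    tp, fp, fn, tn = confusion(y_true, y_score, threshold)
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(y_true, y_score, threshold=0.5):
    tp, fp, fn, _ = confusion(y_true, y_score, threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def net_benefit(y_true, y_score, threshold):
    """Net benefit of acting on the model at a given decision threshold (0 < threshold < 1)."""
    tp, fp, _, _ = confusion(y_true, y_score, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in DCA: treat every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical validation-set outcomes (1 = live birth) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(y_true, y_score):.3f}")
    print(f"AUC      = {auc(y_true, y_score):.3f}")
    print(f"F1       = {f1_score(y_true, y_score):.3f}")
    for pt in (0.2, 0.5, 0.7):
        print(f"pt={pt}: model NB = {net_benefit(y_true, y_score, pt):.3f}, "
              f"treat-all NB = {net_benefit_treat_all(y_true, pt):.3f}")
```

In a decision curve, the model is considered useful at thresholds where its net benefit exceeds both the treat-all reference and the treat-none reference (net benefit of zero).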