Construction and validation of a machine learning-based risk assessment model for primary ovarian cancer
10.13491/j.issn.1004-714X.2025.06.016
- VernacularTitle:基于机器学习的原发性卵巢癌风险评估模型构建及验证研究
- Author:
Yuanyuan ZHANG (1); Hongming MA (2)
Author Information
1. Department of Gynecology and Obstetrics, Nanjing Pukou People’s Hospital, Nanjing 211800, China.
2. Obstetrics Comprehensive Ward, Nanjing Pukou People’s Hospital, Nanjing 211800, China.
- Publication Type:Original Articles
- Keywords:
Primary ovarian cancer;
Computed tomography feature;
Serological marker;
Machine learning
- From:
Chinese Journal of Radiological Health
2025;34(6):880-888
- Country:China
- Language:Chinese
- Abstract:
Objective To evaluate the clinical value of a machine learning model built from computed tomography (CT) imaging features and serological markers for predicting the risk of primary ovarian cancer. Methods We retrospectively collected baseline data, imaging features, and laboratory indicators from 490 patients with histopathologically confirmed ovarian lesions who visited the Department of Gynecology and Obstetrics at Nanjing Pukou People’s Hospital between March 2021 and January 2025. The patients were randomly divided into a training set and a validation set at a 7:3 ratio. Based on the pathological results, patients in each set were categorized into a benign lesion group and a primary ovarian cancer group for comparison. All training-set data were entered into LASSO regression to screen for the optimal indicators. Models were then built with eight machine learning classifiers to identify the best-performing one, and the data were entered into that model to screen for independent predictors. The resulting prediction model was evaluated, visualized, and internally validated using the validation set. Results LASSO regression selected 12 variables. Among the machine learning models, logistic regression performed best, with the highest area under the receiver operating characteristic curve (AUC) and the highest F1 score. Binary logistic regression analysis indicated that venous phase CT value, CT perfusion collateral index score, CA199 level, HE4 level, tumor margin features, and lymphocyte-to-monocyte ratio were all influencing factors for primary ovarian cancer (P < 0.05). The prediction model achieved an AUC of 0.89 in the training set and 0.80 in the validation set, indicating favorable clinical benefit. Conclusion The machine learning model integrating CT imaging features and laboratory indicators performs well in predicting the risk of primary ovarian cancer.
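
For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal Python sketch of a random 7:3 split and of the two metrics reported (AUC and F1 score). It is not the authors' code: the function names, the fixed seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
import random


def split_7_3(items, seed=0):
    """Randomly split items into a 70% training list and a 30% validation list."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = (7 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 = 2*TP / (2*TP + FP + FN) after thresholding the predicted scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical patient indices: 490 matches the cohort size in the abstract.
train_ids, valid_ids = split_7_3(range(490))

# Toy labels and predicted risks, invented for illustration (1 = cancer, 0 = benign).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(len(train_ids), len(valid_ids))
print(roc_auc(toy_labels, toy_scores), f1_score(toy_labels, toy_scores))
```

Under these assumptions, a 7:3 split of 490 patients gives 343 training cases and 147 validation cases. The AUC does not depend on a decision threshold, whereas the F1 score does, which is one reason the two are often reported together.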