1. An advanced machine learning method for simultaneous breast cancer risk prediction and risk ranking in Chinese population: A prospective cohort and modeling study
Liyuan LIU ; Yong HE ; Chunyu KAO ; Yeye FAN ; Fu YANG ; Fei WANG ; Lixiang YU ; Fei ZHOU ; Yujuan XIANG ; Shuya HUANG ; Chao ZHENG ; Han CAI ; Heling BAO ; Liwen FANG ; Linhong WANG ; Zengjing CHEN ; Zhigang YU
Chinese Medical Journal 2024;137(17):2084-2091
Background: Highly accurate and readily interpretable breast cancer (BC) risk-stratification tools for Asian women are lacking. We aimed to develop risk-stratification models to predict long- and short-term BC risk among Chinese women and to simultaneously rank potential non-experimental risk factors. Methods: The Breast Cancer Cohort Study in Chinese Women, a large ongoing prospective dynamic cohort study, includes 122,058 women aged 25-70 years from the eastern part of China. Using parametric methods (penalized logistic regression, bootstrap, and ensemble learning), we developed machine-learning risk prediction models to estimate BC risk: the short-term ensemble penalized logistic regression (EPLR) risk prediction model and the ensemble penalized long-term (EPLT) risk prediction model. The models were assessed for calibration and discrimination and were then externally validated in new study participants enrolled from 2017 to 2020. Results: The area under the receiver operating characteristic curve (AUC) of the short-term EPLR risk prediction model was 0.800 in internal validation and 0.751 in external validation. For the long-term EPLT risk prediction model, the AUC was 0.692 and 0.760 in internal and external validation, respectively. In external validation, the net reclassification improvement index of the EPLT model relative to the Gail model and the Han Chinese Breast Cancer Prediction Model (HCBCP) was 0.193 and 0.233, respectively, indicating that the EPLT model has higher classification accuracy than either. Conclusions: We developed the EPLR and EPLT models to screen populations at high risk of developing BC. These models can serve as useful tools to aid risk-stratified screening and BC prevention.
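The two evaluation metrics reported in this abstract, AUC and net reclassification improvement (NRI), can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy risk scores, the 0.15 high-risk cutoff, and the two-category form of the NRI are hypothetical and are not taken from the study, which does not state which NRI variant or threshold it used.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outranks a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def two_category_nri(old_scores, new_scores, labels, threshold):
    """Two-category NRI of a new model relative to an old one.

    NRI = (up - down among cases) / n_cases
        + (down - up among non-cases) / n_noncases,
    where "up" means moving from low risk to high risk under the new model.
    """
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    for old, new, y in zip(old_scores, new_scores, labels):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            up_case += moved_up
            down_case += moved_down
        else:
            up_ctrl += moved_up
            down_ctrl += moved_down
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl


# Hypothetical toy data: 3 cases (label 1) and 5 non-cases (label 0).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
old_risk = [0.30, 0.10, 0.05, 0.20, 0.10, 0.05, 0.02, 0.01]
new_risk = [0.40, 0.25, 0.05, 0.10, 0.12, 0.05, 0.02, 0.01]

auc_old = auc(old_risk, labels)
auc_new = auc(new_risk, labels)
nri = two_category_nri(old_risk, new_risk, labels, threshold=0.15)
```

A positive NRI means the new model moves cases into the high-risk category and non-cases out of it more often than the reverse, which is the sense in which the abstract reads 0.193 and 0.233 as improved classification.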
2. Predictive analysis and risk assessment of Kümmell's disease in patients with osteoporotic vertebral compression fractures
Zengjing LIU ; Linghong WU ; Jiarui CHEN ; Mingbo WANG ; Xianglong ZHUO ; Xiaozhong PENG ; Xiangtao XIE
Chinese Journal of Orthopaedics 2024;44(11):756-763
Objective: To analyze predictive risk indicators associated with the development of Kümmell's disease (KD) in patients with osteoporotic vertebral compression fractures (OVCFs). Methods: A 1:1 frequency-matched case-control study design was employed, selecting patients who visited the Department of Spine Surgery at Liuzhou Workers' Hospital from January 2021 to June 2023. Patients were divided into case and control groups according to whether they progressed to KD. Detailed demographic information, comorbidities, and laboratory data were collected, and baseline characteristics of the two groups were compared. Variables significantly associated with KD were first screened through univariate analysis. A correlation heatmap was then constructed to assess collinearity among these variables, followed by further selection of potential predictors using the Lasso regression model. Finally, a multivariable logistic regression model was used to analyze KD-related risk indicators. Results: Univariate analysis identified significant predictors of KD, including patient age, bone mineral density, kyphotic Cobb angle, and multiple vertebral fractures. These were included in the subsequent Lasso regression analysis, which identified the following key predictors with non-zero coefficients: age, bone mineral density, Cobb angle, multiple vertebral fractures, platelet count (PLT), aspartate aminotransferase/alanine aminotransferase ratio (AST/ALT), albumin (Alb), albumin/globulin ratio (Alb/Glb), alkaline phosphatase (ALP), urea (UREA), serum uric acid (SUA), fibrinogen (Fn), blood glucose (BG), and C-reactive protein (CRP). The correlation heatmap showed the correlation and collinearity risks among these variables, with ALT and AST/ALT showing a high correlation (r=0.750) and PLT and Alb showing a low correlation (r=-0.110). Multivariable logistic regression indicated that the presence of multiple vertebral fractures [OR=2.078, 95% CI (1.072, 4.025), P=0.030], increased Cobb angle [OR=1.033, 95% CI (1.008, 1.058), P=0.009], and elevated levels of ALP [OR=1.013, 95% CI (1.004, 1.023), P=0.006] and SUA [OR=1.004, 95% CI (1.000, 1.007), P=0.043] were associated with an increased risk of KD in patients with OVCFs. Conversely, decreased levels of Fn [OR=0.996, 95% CI (0.992, 0.999), P=0.008] were linked to an increased risk of KD. Conclusion: Multiple vertebral fractures, increased Cobb angle, elevated levels of ALP and SUA, and decreased levels of Fn can be used as early-warning indicators to predict whether patients with OVCFs will develop KD. Monitoring these indicators is crucial for early detection and intervention in these patients.
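Odds ratios close to 1, such as those reported for Cobb angle, ALP, SUA, and Fn, are easier to read once rescaled. The sketch below is an illustration only, under two assumptions that the abstract does not confirm: that each OR is expressed per one unit of the predictor, and that the confidence intervals are Wald intervals, symmetric on the log-odds scale. The function names are hypothetical.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds coefficient, its standard error, the z statistic,
    and a two-sided p-value from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


def scaled_odds_ratio(odds_ratio, delta):
    """OR for a change of `delta` units, given the per-unit OR.

    In a logistic model the log-odds are linear in the predictor,
    so the OR multiplies: OR(delta) = OR ** delta.
    """
    return odds_ratio ** delta


# Cobb angle values as reported in the abstract: OR=1.033, 95% CI (1.008, 1.058).
beta, se, z, p = wald_summary(1.033, 1.008, 1.058)

# Hypothetical 10-unit increase in Cobb angle (unit assumed to be degrees).
or_10 = scaled_odds_ratio(1.033, 10)
```

Under these assumptions, a 10-unit increase in Cobb angle corresponds to an odds ratio of roughly 1.38, and the p-value recovered from the interval is close to the reported P=0.009, although rounding in the published figures limits the precision of that check.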