1. Application and Interpretability of the Unbalanced Ensemble Algorithm LASSO-EasyEnsemble in Prognostic Prediction of Coronary Heart Disease
Jiaxin ZAN ; Hong YANG ; Jing TIAN
Chinese Journal of Health Statistics 2025;42(2):197-203
Objective: Given the high noise and inter-class imbalance encountered in prognostic prediction of coronary heart disease, this study constructs an EasyEnsemble imbalanced ensemble model after LASSO feature selection and evaluates its performance.
Methods: Using survey data from the National Health and Nutrition Examination Survey (NHANES) public database for 2009-2018, with follow-up through 2019, the prognosis of coronary heart disease was predicted with death due to the disease as the outcome. LASSO was used to select relevant features. An EasyEnsemble imbalanced ensemble prediction model, as well as SMOTE+LightGBM, XGBoost, and Random Forest prediction models, were then built on the selected features. Grid search was used to tune the parameters of each model. Classification performance was evaluated using AUC, precision, specificity, G-mean, and performance curves. SHAP analysis was applied to interpret the models' results.
Results: The EasyEnsemble model showed the best overall performance, with an AUC of 0.80 (95% CI: 0.79-0.82), precision of 0.86 (95% CI: 0.78-0.93), specificity of 0.99 (95% CI: 0.98-0.99), and G-mean of 0.79 (95% CI: 0.76-0.83), as also evidenced by the performance curves. Age, serum phosphorus, diabetes, and albumin were identified as important factors influencing patient prognosis.
Conclusion: The LASSO-EasyEnsemble imbalanced ensemble model enables accurate prognostic prediction for patients with coronary heart disease. Combining it with SHAP can help clinicians better assess disease severity and identify at-risk groups for personalized patient management.
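As a reading aid for the reported metrics, the following minimal sketch shows how precision, specificity, and G-mean relate to confusion-matrix counts. It assumes the common definition of G-mean as the geometric mean of sensitivity and specificity; the abstract does not state which definition the authors used. The function names and the counts are hypothetical and are not taken from the study. Under that assumption, the reported point estimates (G-mean 0.79, specificity 0.99) would correspond to a sensitivity of roughly 0.63, though rounding in the published values makes this only approximate.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute metrics from confusion-matrix counts (all denominators assumed non-zero)."""
    precision = tp / (tp + fp)      # share of predicted positives that are true positives
    sensitivity = tp / (tp + fn)    # recall on the minority (event) class
    specificity = tn / (tn + fp)    # recall on the majority (non-event) class
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


def implied_sensitivity(g_mean, specificity):
    """Back out sensitivity from G-mean and specificity under the assumed definition."""
    return g_mean ** 2 / specificity


# Hypothetical counts, chosen only to illustrate an imbalanced outcome.
example = classification_metrics(tp=60, fp=10, tn=900, fn=30)

# Point estimates taken from the abstract; the result is approximate because
# the published values are rounded to two decimals.
sensitivity_from_abstract = implied_sensitivity(g_mean=0.79, specificity=0.99)

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.3f}")
    print(f"implied sensitivity: {sensitivity_from_abstract:.2f}")
```

The sketch illustrates why G-mean is reported alongside specificity for imbalanced outcomes: a model can reach very high specificity by rarely predicting the event, and G-mean penalizes this by also weighting performance on the minority class.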
