Journal of Preventive Medicine 2025;37(2):148-153
doi:10.19485/j.cnki.issn2096-5087.2025.02.009
Health literacy prediction models based on machine learning methods: a scoping review
PAN Xiang ; TONG Yingge ; LI Yixuan ; NI Ke ; CHENG Wenqian ; XIN Mengyu ; HU Yuying
Keywords
health literacy; prediction model; machine learning; scoping review
Country
China
Language
Chinese
Abstract
Objective: To conduct a scoping review of the types, construction methods and predictive performance of health literacy prediction models based on machine learning methods, so as to provide a reference for the improvement and application of such models.
Methods: Publications on health literacy prediction models built using machine learning methods were retrieved from CNKI, Wanfang Data, VIP, PubMed and Web of Science from inception to May 1, 2024. The quality of the literature was assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Basic characteristics, modeling methods, data sources, missing value handling, predictors and predictive performance were reviewed.
Results: A total of 524 publications were retrieved, and 22 publications published between 2007 and 2024 were finally included. These covered 48 health literacy prediction models, of which 25 (52.08%) had a high risk of bias, with the main issues concerning missing value handling, predictor selection and model evaluation methods. Modeling methods included regression models, tree-based machine learning methods, support vector machines and neural network models. Predictors primarily covered four levels: individual, interpersonal, organizational and societal/policy, with age, educational level, economic status, health status and internet use appearing frequently. Internal validation was conducted in 14 publications, and external validation in 4 publications. Forty-two models reported the area under the receiver operating characteristic curve, which ranged from 0.52 to 0.983, indicating generally good discrimination.
Conclusion: Health literacy prediction models based on machine learning methods perform well, but have deficiencies in risk of bias, data processing and validation.
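
For readers less familiar with the discrimination measure reported in the Results, the following is a minimal sketch of how an area under the receiver operating characteristic curve (AUC) can be computed. It is not taken from any of the reviewed studies. It assumes binary labels (1 = adequate health literacy, 0 = inadequate) and uses the rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The function name `auc` and the toy data are hypothetical.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the AUC; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from the review).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(round(auc(labels, scores), 3))  # 8 of 9 positive/negative pairs are ordered correctly
```

Under this reading, a value near 0.5 means the model ranks cases no better than chance, while a value near 1 means almost every positive case is ranked above every negative case.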