Machine learning-based prediction model for caries in the first molars of 9-year-old children in Suzhou.
10.7518/hxkq.2025.2025135
- Author:
Lingzhi CHEN 1; Xiaqin WANG 1; Kaifei ZHU 2; Kun REN 3; Zhen WU 1
Author Information
1. Dept. of Dentistry, Suzhou Wuzhong People's Hospital, Suzhou 215128, China.
2. Suzhou Wuzhong Center for Disease Control and Prevention, Suzhou 215128, China.
3. The Key Laboratory of Engineering of Suzhou University of Science and Technology, Suzhou 215000, China.
- Publication Type:Journal Article
- Keywords:
first permanent molar;
influencing factor;
machine learning;
prediction model
- MeSH:
Humans;
Dental Caries/epidemiology*;
Child;
Machine Learning;
China/epidemiology*;
Molar;
Risk Factors;
Female;
Logistic Models;
Male;
Decision Trees;
Algorithms
- From:
West China Journal of Stomatology
2025;43(6):871-880
- Country:China
- Language:Chinese
- Abstract:
OBJECTIVES:This study aimed to use machine learning algorithms to build a prediction model for first permanent molar caries in 9-year-old children in Suzhou and to identify risk factors.
METHODS:Stratified random cluster sampling was used to select 9-year-old students from 38 primary schools in 14 townships and subdistricts of Wuzhong District for oral examination and a questionnaire survey. Multivariable logistic regression was used to analyze risk factors for caries. The data set was randomly divided into a training set and a validation set at a ratio of 8∶2, and R 4.3.1 was used to build models with five machine learning algorithms: random forest, decision tree, extreme gradient boosting (XGBoost), logistic regression, and light gradient boosting machine (LightGBM). The predictive performance of the five models was evaluated using the area under the receiver operating characteristic curve (AUC). The marginal contribution of each feature to the caries prediction model was quantified using Shapley additive explanations (SHAP).
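The split and evaluation steps described in the methods can be illustrated with a minimal sketch. This is not the authors' code (they worked in R 4.3.1); it is a plain-Python illustration that assumes a simple random 8∶2 split and computes AUC through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names `random_split` and `roc_auc`, the seed, and the toy labels and scores are all hypothetical.

```python
import random


def random_split(n_rows, train_frac=0.8, seed=0):
    """Shuffle row indices and cut them into training and validation parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    idx = list(range(n_rows))
    rng.shuffle(idx)
    cut = round(n_rows * train_frac)
    return idx[:cut], idx[cut:]


def roc_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data only: hypothetical labels (1 = caries) and hypothetical model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

train_idx, valid_idx = random_split(n_rows=10)
print(len(train_idx), len(valid_idx))
print(roc_auc(labels, scores))
```

The pairwise comparison is quadratic in sample size, which is acceptable for illustration. For a data set of several thousand records a sort-based AUC, as implemented in standard statistics packages, would be the usual choice.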
RESULTS:This study included 7 225 eligible participants. The prevalence of first permanent molar caries was 54.96%. Multivariable logistic regression analysis showed that consumption of sweet drinks, consumption of desserts and candy, snack frequency, and snacking before bed after toothbrushing were associated with first permanent molar caries (P<0.05). The AUC values of the decision tree, logistic regression, LightGBM, random forest, and XGBoost models were 75.5%, 83.9%, 88.6%, 88.9%, and 90.1%, respectively. Among the one-hot encoded variables, high-frequency sweet intake (such as desserts and candy ≥2 times a day and mother's sugary diet ≥2 times a day) and poor oral hygiene habits (such as frequent snacking before bed after toothbrushing and irregular toothbrushing) had the highest positive SHAP values.
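To make the SHAP result on encoded variables concrete, the following sketch one-hot encodes a categorical answer and computes exact Shapley values by brute-force enumeration of feature subsets. It is an illustration under stated assumptions, not the study's procedure: the category labels, the weights of the toy linear model, and the background rows are invented, and practical SHAP tooling for tree models uses faster specialized algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 indicator per category."""
    return [1 if value == c else 0 for c in categories]


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to the background average."""
    n = len(x)

    def value(subset):
        # Mean output when features in `subset` take x's values and the
        # remaining features are taken from each background row in turn.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):  # k = size of the subset that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)
    return phi


# Hypothetical categories for "dessert and candy frequency".
cats = ["<1/day", "1/day", ">=2/day"]
# Hypothetical linear score on the log-odds scale (weights are made up).
weights = [0.0, 0.4, 1.1]


def toy_model(v):
    return sum(w * vi for w, vi in zip(weights, v))


background = [one_hot(c, cats) for c in ["<1/day", "<1/day", "1/day", ">=2/day"]]
child = one_hot(">=2/day", cats)
print(shapley_values(toy_model, child, background))
```

A positive value for the `>=2/day` indicator means that, in this toy model, reporting that category pushes the prediction above the background average. The values for all features sum to the difference between the model output for the child and the mean output over the background rows.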
CONCLUSIONS:The XGBoost algorithm showed good predictive performance for first permanent molar caries in 9-year-old children. High-frequency sweet intake and poor oral hygiene habits have a strong positive effect on the risk of first permanent molar caries and are key drivers that can inform targeted interventions.