1.Comparison of Logistic Regression and Machine Learning Approaches in Predicting Depressive Symptoms: A National-Based Study
Xing-Xuan DONG ; Jian-Hua LIU ; Tian-Yang ZHANG ; Chen-Wei PAN ; Chun-Hua ZHAO ; Yi-Bo WU ; Dan-Dan CHEN
Psychiatry Investigation 2025;22(3):267-278
Objective:
Machine learning (ML) has been reported to have better predictive capability than traditional statistical techniques. The aim of this study was to assess the efficacy of ML algorithms and logistic regression (LR) for predicting depressive symptoms during the COVID-19 pandemic.
Methods:
Analyses were carried out in a national cross-sectional study involving 21,916 participants. The ML algorithms in this study included random forest (RF), support vector machine (SVM), neural network (NN), and gradient boosting machine (GBM) methods. The performance indices were sensitivity, specificity, accuracy, precision, F1-score, and area under the receiver operating characteristic curve (AUC).
Results:
LR and NN had the best performance in terms of AUCs. The risk of overfitting was found to be negligible for most ML models except for RF, and GBM obtained the highest sensitivity, specificity, accuracy, precision, and F1-score. Therefore, LR, NN, and GBM models ranked among the best models.
Conclusion
The LR model performed comparably to the ML models in predicting depressive symptoms and identifying potential risk factors, while also exhibiting a lower risk of overfitting.
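The threshold-based performance indices named in the Methods above (sensitivity, specificity, accuracy, precision, F1-score) all derive from a 2x2 confusion matrix. The following is a minimal sketch of those definitions, not the authors' code; the counts are hypothetical and the names `ConfusionCounts` and `classification_metrics` are illustrative. AUC is omitted because it requires predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical container, not from the study)."""
    tp: int  # depressive symptoms present, predicted present
    fp: int  # depressive symptoms absent, predicted present
    tn: int  # depressive symptoms absent, predicted absent
    fn: int  # depressive symptoms present, predicted absent


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the five threshold-based indices from confusion-matrix counts."""
    sensitivity = c.tp / (c.tp + c.fn)          # recall, true positive rate
    specificity = c.tn / (c.tn + c.fp)          # true negative rate
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    precision = c.tp / (c.tp + c.fp)            # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only; they are NOT results from the study.
example = ConfusionCounts(tp=80, fp=30, tn=170, fn=20)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every index depends on the chosen classification threshold, models compared this way should share a common thresholding rule; AUC avoids that dependence by summarizing performance over all thresholds.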
6.Explainable machine learning model for predicting septic shock in critically sepsis patients based on coagulation indexes: A multicenter cohort study.
Qing-Bo ZENG ; En-Lan PENG ; Ye ZHOU ; Qing-Wei LIN ; Lin-Cui ZHONG ; Long-Ping HE ; Nian-Qing ZHANG ; Jing-Chun SONG
Chinese Journal of Traumatology 2025;28(6):404-411
PURPOSE:
Septic shock is associated with high mortality and poor outcomes among sepsis patients with coagulopathy. Although traditional statistical methods or machine learning (ML) algorithms have been proposed to predict septic shock, these potential approaches have never been systematically compared. The present work aimed to develop and compare models to predict septic shock among patients with sepsis.
METHODS:
This was a retrospective cohort study of 484 patients with sepsis who were admitted to our intensive care units between May 2018 and November 2022. Patients from the 908th Hospital of Chinese PLA Logistical Support Force and from Nanchang Hongdu Hospital of Traditional Chinese Medicine were allocated to the training (n=311) and validation (n=173) sets, respectively. All clinical and laboratory data of the sepsis patients, characterized by comprehensive coagulation indexes, were collected. We developed 5 models based on ML algorithms and 1 model based on a traditional statistical method to predict septic shock in the training cohort. The performance of all models was assessed using the area under the receiver operating characteristic curve and calibration plots. Decision curve analysis was used to evaluate the net benefit of the models. The validation set was used to verify the predictive accuracy of the models. This study also used the Shapley additive explanations method to assess variable importance and explain the predictions made by an ML algorithm.
RESULTS:
Among all patients, 37.2% experienced septic shock. The areas under the receiver operating characteristic curve of the 6 models ranged from 0.833 to 0.962 in the training set and from 0.630 to 0.744 in the validation set. The best-performing model was based on the support vector machine (SVM) algorithm and was constructed from age, tissue plasminogen activator-inhibitor complex, prothrombin time, international normalized ratio, white blood cell count, and platelet count. The SVM model showed good calibration and discrimination and a greater net benefit in decision curve analysis.
CONCLUSION
The SVM algorithm may be superior to other ML and traditional statistical algorithms for predicting septic shock. Physicians can better understand the reliability of the predictive model through Shapley additive explanations value analysis.
Humans; Shock, Septic/blood*; Machine Learning; Male; Female; Retrospective Studies; Middle Aged; Aged; Sepsis/complications*; ROC Curve; Cohort Studies; Adult; Intensive Care Units; Algorithms; Blood Coagulation; Critical Illness
8.Association of Body Mass Index with All-Cause Mortality and Cause-Specific Mortality in Rural China: 10-Year Follow-up of a Population-Based Multicenter Prospective Study.
Juan Juan HUANG ; Yuan Zhi DI ; Ling Yu SHEN ; Jian Guo LIANG ; Jiang DU ; Xue Fang CAO ; Wei Tao DUAN ; Ai Wei HE ; Jun LIANG ; Li Mei ZHU ; Zi Sen LIU ; Fang LIU ; Shu Min YANG ; Zu Hui XU ; Cheng CHEN ; Bin ZHANG ; Jiao Xia YAN ; Yan Chun LIANG ; Rong LIU ; Tao ZHU ; Hong Zhi LI ; Fei SHEN ; Bo Xuan FENG ; Yi Jun HE ; Zi Han LI ; Ya Qi ZHAO ; Tong Lei GUO ; Li Qiong BAI ; Wei LU ; Qi JIN ; Lei GAO ; He Nan XIN
Biomedical and Environmental Sciences 2025;38(10):1179-1193
OBJECTIVE:
This study aimed to explore the association between body mass index (BMI) and mortality based on a 10-year population-based multicenter prospective study.
METHODS:
A general population-based multicenter prospective study was conducted at four sites in rural China between 2013 and 2023. Multivariate Cox proportional hazards models and restricted cubic spline analyses were used to assess the association between BMI and mortality. Stratified analyses were performed based on the individual characteristics of the participants.
RESULTS:
Overall, 19,107 participants contributing a total of 163,095 person-years were included, and 1,910 participants died. Underweight participants (< 18.5 kg/m²) showed increased all-cause mortality (adjusted hazard ratio [aHR] = 2.00, 95% confidence interval [CI]: 1.66-2.41), whereas overweight (≥ 24.0 to < 28.0 kg/m²) and obesity (≥ 28.0 kg/m²) were associated with decreased mortality, with aHRs of 0.61 (95% CI: 0.52-0.73) and 0.51 (95% CI: 0.37-0.70), respectively. Overweight (aHR = 0.76, 95% CI: 0.67-0.86) and mild obesity (aHR = 0.72, 95% CI: 0.59-0.87) were associated with lower mortality in people older than 60 years. All-cause mortality decreased rapidly until reaching a BMI of 25.7 kg/m² (aHR = 0.95, 95% CI: 0.92-0.98) and increased slightly above that value, indicating a U-shaped association. The inverse association between overweight and mortality was robust in most subgroups and sensitivity analyses.
CONCLUSION
This study provides additional evidence that overweight and mild obesity may be inversely related to the risk of death in individuals older than 60 years. Therefore, it is essential to consider age differences when formulating health and weight management strategies.
Humans; Body Mass Index; China/epidemiology*; Male; Female; Middle Aged; Prospective Studies; Rural Population/statistics & numerical data*; Aged; Follow-Up Studies; Adult; Mortality; Cause of Death; Obesity/mortality*; Overweight/mortality*
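As a rough arithmetic check on the follow-up totals reported in the Results of the BMI study above, the sketch below derives a crude all-cause mortality rate and the mean follow-up per participant. These derived figures are not reported in the abstract and are unadjusted; they are shown only to illustrate how person-years are used, and the variable names are illustrative.

```python
# Totals taken from the abstract's Results section.
participants = 19_107
person_years = 163_095
deaths = 1_910

# Crude (unadjusted) mortality rate: deaths divided by person-time at risk,
# scaled to a conventional denominator of 1,000 person-years.
crude_rate_per_1000_py = deaths / person_years * 1_000

# Mean follow-up per participant, in years.
mean_follow_up_years = person_years / participants

print(f"Crude mortality rate: {crude_rate_per_1000_py:.2f} per 1,000 person-years")
print(f"Mean follow-up: {mean_follow_up_years:.2f} years per participant")
```

A crude rate ignores age, sex, and other covariates, which is why the study reports adjusted hazard ratios from Cox models rather than raw rates.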
9.Bioequivalence study of pitavastatin calcium dispersible tablets in healthy Chinese volunteers
Wei ZHANG ; Chun-Miao PAN ; Xiao-Dan WANG ; Yin HU ; Rong SHAO ; Bo JIANG
The Chinese Journal of Clinical Pharmacology 2024;40(10):1497-1501
Objective:
To compare the bioavailability and evaluate the bioequivalence of pitavastatin calcium dispersible tablets in healthy Chinese subjects.
Methods:
A single oral dose (2 mg) of the test or reference preparation of pitavastatin calcium was administered under fasting and postprandial conditions. Plasma concentrations of pitavastatin calcium were measured at different time points before and after administration by high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS), and the bioequivalence of the two formulations was evaluated.
Results:
Under fasting conditions, for the test and reference preparations respectively, Cmax was (47.79±23.99) and (46.03±21.82) ng·L⁻¹; AUC0-t was (96.56±42.64) and (97.96±35.40) ng·h·L⁻¹; and AUC0-∞ was (102.09±43.01) and (103.46±35.62) ng·h·L⁻¹. The 90% confidence intervals of the geometric mean ratios of Cmax, AUC0-t and AUC0-∞ for the test versus reference formulation were 96.28%-111.16%, 94.46%-101.19% and 94.77%-101.31%, respectively. Under postprandial conditions, for the test and reference preparations respectively, Cmax was (27.32±10.68) and (28.58±11.39) ng·L⁻¹; AUC0-t was (82.76±27.58) and (84.06±29.12) ng·h·L⁻¹; and AUC0-∞ was (87.88±26.93) and (89.29±29.18) ng·h·L⁻¹. The 90% confidence intervals of the geometric mean ratios of Cmax, AUC0-t and AUC0-∞ for the test versus reference formulation were 87.39%-102.10%, 94.62%-101.34% and 94.88%-101.47%, respectively. All intervals were within the bioequivalence range of 80.00% to 125.00%.
Conclusion:
The two pitavastatin calcium dispersible tablet formulations were bioequivalent and safe in healthy Chinese adult subjects.
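The bioequivalence conclusion above rests on a simple rule: each 90% confidence interval of the geometric mean ratio must lie entirely within 80.00% to 125.00%. The following minimal sketch applies that rule to the six intervals reported in the abstract. It is an illustration of the acceptance check only, not the authors' statistical analysis; the dictionary keys and the function name `within_be_range` are illustrative.

```python
# Acceptance limits stated in the abstract (percent).
BE_LOWER, BE_UPPER = 80.00, 125.00

# 90% CIs of the test/reference geometric mean ratios, as reported (percent).
# Keys are labels for the two reported result sets, in the order they appear.
reported_ci = {
    ("first set", "Cmax"): (96.28, 111.16),
    ("first set", "AUC0-t"): (94.46, 101.19),
    ("first set", "AUC0-inf"): (94.77, 101.31),
    ("second set", "Cmax"): (87.39, 102.10),
    ("second set", "AUC0-t"): (94.62, 101.34),
    ("second set", "AUC0-inf"): (94.88, 101.47),
}


def within_be_range(ci: tuple) -> bool:
    """Return True if the whole interval lies inside the acceptance limits."""
    lower, upper = ci
    return BE_LOWER <= lower and upper <= BE_UPPER


for (condition, parameter), ci in reported_ci.items():
    verdict = "within" if within_be_range(ci) else "outside"
    print(f"{condition:10s} {parameter:8s} {ci[0]:.2f}%-{ci[1]:.2f}%: {verdict} range")

all_bioequivalent = all(within_be_range(ci) for ci in reported_ci.values())
print("All intervals within 80.00%-125.00%:", all_bioequivalent)
```

Both ends of each interval are checked, so an interval that only overlaps the acceptance range would fail.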
10.A newly proposed heatstroke-induced coagulopathy score in patients with heat illness: A multicenter retrospective study in China
Qing-Wei LIN ; Lin-Cui ZHONG ; Long-Ping HE ; Qing-Bo ZENG ; Wei ZHANG ; Qing SONG ; Jing-Chun SONG
Chinese Journal of Traumatology 2024;27(2):83-90
Purpose:
In patients with heatstroke, disseminated intravascular coagulation (DIC) is associated with a greater risk of in-hospital mortality. However, time-consuming assays or a complex diagnostic system may delay immediate treatment. Therefore, the present study proposes a new heatstroke-induced coagulopathy (HIC) score in patients with heat illness as an early warning indicator for DIC.
Methods:
This retrospective study enrolled patients with heat illness in 24 Chinese hospitals from March 2021 to May 2022. Patients who were under 18 years old, had a congenital clotting disorder or liver disease, or were using anticoagulants were excluded. Data were collected on demographic characteristics, routine blood tests, conventional coagulation assays and biochemical indexes. The risk factors related to coagulation function in heatstroke were identified by regression analysis and used to construct a scoring system for HIC. The data of patients who met the diagnostic criteria for HIC and for DIC as defined by the International Society on Thrombosis and Haemostasis (ISTH) were analyzed. All statistical analyses were performed using SPSS 26.0.
Results:
The final analysis included 302 patients with heat illness, of whom 131 (43.4%) suffered from heatstroke, including 7 deaths (5.3%). Core temperature (OR = 1.681, 95% CI 1.291 - 2.189, p < 0.001), prothrombin time (OR = 1.427, 95% CI 1.175 - 1.733, p < 0.001) and D-dimer (OR = 1.242, 95% CI 1.049 - 1.471, p = 0.012) were independent risk factors for heatstroke and, because of their close relation to abnormal coagulation, were used to construct the HIC scoring system. A total score ≥ 3 indicated HIC, and HIC scores correlated with the ISTH-DIC score (r = 0.8848, p < 0.001). Among all 131 heatstroke patients, the incidence of HIC (27.5%) was higher than that of DIC (11.2%), while the mortality rate of HIC (19.4%) was lower than that of DIC (46.7%). When HIC developed into DIC, parameters of coagulation dysfunction changed significantly: platelet count decreased, D-dimer level rose, and prothrombin time and activated partial thromboplastin time were prolonged (p < 0.05).
Conclusions:
The newly proposed HIC score may provide a valuable tool for early detection of HIC and prompt initiation of treatment.
