1. Comparison of Logistic Regression and Machine Learning Approaches in Predicting Depressive Symptoms: A National-Based Study
Xing-Xuan DONG ; Jian-Hua LIU ; Tian-Yang ZHANG ; Chen-Wei PAN ; Chun-Hua ZHAO ; Yi-Bo WU ; Dan-Dan CHEN
Psychiatry Investigation 2025;22(3):267-278
Objective:
Machine learning (ML) has been reported to have better predictive capability than traditional statistical techniques. The aim of this study was to assess the efficacy of ML algorithms and logistic regression (LR) for predicting depressive symptoms during the COVID-19 pandemic.
Methods:
Analyses were carried out in a national cross-sectional study involving 21,916 participants. The ML algorithms in this study included random forest (RF), support vector machine (SVM), neural network (NN), and gradient boosting machine (GBM) methods. The performance indices were sensitivity, specificity, accuracy, precision, F1-score, and area under the receiver operating characteristic curve (AUC).
Results:
LR and NN had the best performance in terms of AUC. The risk of overfitting was negligible for most ML models, RF being the exception. GBM obtained the highest sensitivity, specificity, accuracy, precision, and F1-score. Therefore, the LR, NN, and GBM models ranked among the best models.
Conclusion:
The LR model performed comparably to the ML models in predicting depressive symptoms and identifying potential risk factors, while also exhibiting a lower risk of overfitting.
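The threshold-based performance indices named in the Methods above (sensitivity, specificity, accuracy, precision, F1-score) all follow from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. AUC is omitted because it requires predicted scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard performance indices from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives at a fixed decision threshold.
    """
    sensitivity = tp / (tp + fn)              # recall: positives correctly detected
    specificity = tn / (tn + fp)              # negatives correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                # positive predictive value
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only; these are NOT results from the study.
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because every index here depends on the chosen threshold, two models can rank differently on these indices than on AUC, which is consistent with GBM leading on the threshold-based indices while LR and NN lead on AUC.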
6. Clinical course, causes of worsening, and outcomes of severe ischemic stroke: A prospective multicenter cohort study.
Simiao WU ; Yanan WANG ; Ruozhen YUAN ; Meng LIU ; Xing HUA ; Linrui HUANG ; Fuqiang GUO ; Dongdong YANG ; Zuoxiao LI ; Bihua WU ; Chun WANG ; Jingfeng DUAN ; Tianjin LING ; Hao ZHANG ; Shihong ZHANG ; Bo WU ; Cairong ZHU ; Craig S ANDERSON ; Ming LIU
Chinese Medical Journal 2025;138(13):1578-1586
BACKGROUND:
Severe stroke has high rates of mortality and morbidity. This study aimed to investigate the clinical course, causes of worsening, and outcomes of severe ischemic stroke.
METHODS:
This prospective, multicenter cohort study enrolled adult patients admitted ≤30 days after ischemic stroke from nine hospitals in China between September 2017 and December 2019. Severe stroke was defined as a score of ≥15 on the National Institutes of Health Stroke Scale (NIHSS). Clinical worsening was defined as an increase of 4 in the NIHSS score from baseline. Unfavorable functional outcome was defined as a modified Rankin scale score ≥3, assessed at 3 months and at 1 year after stroke onset. We performed logistic regression to explore baseline features and reperfusion therapies associated with clinical worsening and functional outcomes.
RESULTS:
Among 4201 patients enrolled, 854 patients (20.33%) had severe stroke on admission. Of 3347 patients without severe stroke on admission, 142 (4.24%) patients developed severe stroke in hospital. Of 854 patients with severe stroke on admission, 33.95% (290/854) experienced clinical worsening (median time from stroke onset: 43 h, Q1-Q3: 20-88 h), with brain edema (54.83% [159/290]) as the leading cause; 24.59% (210/854) of these patients died by 30 days, and 81.47% (677/831) and 78.44% (633/807) had unfavorable functional outcomes at 3 months and 1 year respectively. Reperfusion reduced the risk of worsening (adjusted odds ratio [OR]: 0.24, 95% confidence interval [CI]: 0.12-0.49, P <0.01), 30-day death (adjusted OR: 0.22, 95% CI: 0.11-0.41, P <0.01), and unfavorable functional outcomes at 3 months (adjusted OR: 0.24, 95% CI: 0.08-0.68, P <0.01) and 1 year (adjusted OR: 0.17, 95% CI: 0.06-0.50, P <0.01).
CONCLUSIONS:
Approximately one-fifth of patients with ischemic stroke had severe neurological deficits on admission. Clinical worsening mainly occurred in the first 3 to 4 days after stroke onset, with brain edema as the leading cause of worsening. Reperfusion reduced the risk of clinical worsening and improved functional outcomes.
REGISTRATION:
ClinicalTrials.gov, NCT03222024.
Humans; Male; Female; Prospective Studies; Ischemic Stroke/mortality*; Aged; Middle Aged; Aged, 80 and over; Stroke; Brain Ischemia
7. Explainable machine learning model for predicting septic shock in critically sepsis patients based on coagulation indexes: A multicenter cohort study.
Qing-Bo ZENG ; En-Lan PENG ; Ye ZHOU ; Qing-Wei LIN ; Lin-Cui ZHONG ; Long-Ping HE ; Nian-Qing ZHANG ; Jing-Chun SONG
Chinese Journal of Traumatology 2025;28(6):404-411
PURPOSE:
Septic shock is associated with high mortality and poor outcomes among sepsis patients with coagulopathy. Although traditional statistical methods or machine learning (ML) algorithms have been proposed to predict septic shock, these potential approaches have never been systematically compared. The present work aimed to develop and compare models to predict septic shock among patients with sepsis.
METHODS:
This was a retrospective cohort study of 484 patients with sepsis who were admitted to our intensive care units between May 2018 and November 2022. Patients from the 908th Hospital of Chinese PLA Logistical Support Force formed the training set (n=311), and patients from Nanchang Hongdu Hospital of Traditional Chinese Medicine formed the validation set (n=173). All clinical and laboratory data of sepsis patients characterized by comprehensive coagulation indexes were collected. We developed 5 models based on ML algorithms and 1 model based on a traditional statistical method to predict septic shock in the training cohort. The performance of all models was assessed using the area under the receiver operating characteristic curve and calibration plots. Decision curve analysis was used to evaluate the net benefit of the models. The validation set was used to verify the predictive accuracy of the models. The Shapley additive explanations method was also used to assess variable importance and to explain the predictions made by an ML algorithm.
RESULTS:
Among all patients, 37.2% experienced septic shock. The areas under the receiver operating characteristic curve of the 6 models ranged from 0.833 to 0.962 in the training set and from 0.630 to 0.744 in the validation set. The model with the best prediction performance was based on the support vector machine (SVM) algorithm and was constructed from age, tissue plasminogen activator-inhibitor complex, prothrombin time, international normalized ratio, white blood cell count, and platelet count. The SVM model showed good calibration and discrimination and a greater net benefit in decision curve analysis.
CONCLUSION:
The SVM algorithm may be superior to other ML and traditional statistical algorithms for predicting septic shock. Shapley additive explanations analysis can help physicians better understand the reliability of the predictive model.
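The abstract above evaluates models by their net benefit in decision curve analysis. As a rough illustration of what that quantity measures, the sketch below uses the standard net-benefit formula (true positives per patient minus false positives per patient, the latter weighted by the odds of the threshold probability). It is not the authors' code; the function names and the example counts are hypothetical, and only the 37.2% prevalence comes from the abstract.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of a model at a given threshold probability.

    tp, fp: true and false positives when patients with predicted risk
    at or above `threshold` are classified as positive; n: cohort size.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1 - threshold))


def treat_all_net_benefit(prevalence: float, threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return prevalence - (1 - prevalence) * (threshold / (1 - threshold))


# Hypothetical counts for illustration; not taken from the study.
model_nb = net_benefit(tp=30, fp=20, n=100, threshold=0.2)

# With the reported septic shock prevalence of 37.2%, treating everyone
# has zero net benefit when the threshold equals the prevalence.
reference_nb = treat_all_net_benefit(prevalence=0.372, threshold=0.372)
```

A decision curve plots these values across a range of thresholds; a model is useful at a threshold where its net benefit exceeds both the treat-all value and zero (treat none).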
Humans; Shock, Septic/blood*; Machine Learning; Male; Female; Retrospective Studies; Middle Aged; Aged; Sepsis/complications*; ROC Curve; Cohort Studies; Adult; Intensive Care Units; Algorithms; Blood Coagulation; Critical Illness
8. Characteristics of Gut Microbiota Changes and Their Relationship with Infectious Complications During Induction Chemotherapy in AML Patients.
Quan-Lei ZHANG ; Li-Li DONG ; Lin-Lin ZHANG ; Yu-Juan WU ; Meng LI ; Jian BO ; Li-Li WANG ; Yu JING ; Li-Ping DOU ; Dai-Hong LIU ; Zhen-Yang GU ; Chun-Ji GAO
Journal of Experimental Hematology 2025;33(3):738-744
OBJECTIVE:
To investigate the characteristics of gut microbiota changes in patients with acute myeloid leukemia (AML) undergoing induction chemotherapy and to explore the relationship between infectious complications and gut microbiota.
METHODS:
Fecal samples were collected from 37 newly diagnosed AML patients at four time points: before induction chemotherapy, during chemotherapy, during the neutropenic phase, and during the recovery phase. Metagenomic sequencing was used to analyze the dynamic changes in gut microbiota. Correlation analyses were conducted to assess the relationship between changes in gut microbiota and the occurrence of infectious complications.
RESULTS:
During chemotherapy, the gut microbiota α-diversity (Shannon index) of AML patients exhibited significant fluctuations. Specifically, the diversity decreased significantly during induction chemotherapy, further declined during the neutropenic phase (P < 0.05, compared to baseline), and gradually recovered during the recovery phase, though not fully returning to baseline levels. The abundances of beneficial bacteria, such as Firmicutes and Bacteroidetes, gradually decreased during chemotherapy, whereas the abundances of opportunistic pathogens, including Enterococcus, Klebsiella, and Escherichia coli, progressively increased. Analysis of the dynamic changes in gut microbiota of seven patients with bloodstream infections revealed that the bloodstream infection pathogens could be detected in the gut microbiota of the corresponding patients, with their abundance gradually increasing during the course of infection. This finding suggests that bloodstream infections may be associated with opportunistic pathogens originating from the gut microbiota. Compared to non-infected patients, the baseline samples of infected patients showed a significantly lower relative abundance of Bacteroidetes (P < 0.05). Regression analysis indicated that Bacteroidetes abundance is an independent predictive factor for infectious complications (P < 0.05, OR = 13.143).
CONCLUSION:
During induction chemotherapy in AML patients, gut microbiota α-diversity fluctuates significantly and the abundance of opportunistic pathogens increases, which may be associated with bloodstream infections. Patients with lower baseline Bacteroidetes abundance are more prone to infections, and baseline Bacteroidetes abundance can serve as an independent predictor of infectious complications.
Humans; Gastrointestinal Microbiome; Leukemia, Myeloid, Acute/microbiology*; Induction Chemotherapy; Feces/microbiology*; Male; Female; Middle Aged
9. Effects of Hot Night Exposure on Human Semen Quality: A Multicenter Population-Based Study.
Ting Ting DAI ; Ting XU ; Qi Ling WANG ; Hao Bo NI ; Chun Ying SONG ; Yu Shan LI ; Fu Ping LI ; Tian Qing MENG ; Hui Qiang SHENG ; Ling Xi WANG ; Xiao Yan CAI ; Li Na XIAO ; Xiao Lin YU ; Qing Hui ZENG ; Pi GUO ; Xin Zong ZHANG
Biomedical and Environmental Sciences 2025;38(2):178-193
OBJECTIVE:
To explore and quantify the association of hot night exposure during the sperm development period (0-90 lag days) with semen quality.
METHODS:
A total of 6,640 male sperm donors from 6 human sperm banks in China during 2014-2020 were recruited in this multicenter study. Two indices (i.e., hot night excess [HNE] and hot night duration [HND]) were used to estimate the heat intensity and duration during nighttime. Linear mixed models were used to examine the association between hot nights and semen quality parameters.
RESULTS:
The exposure-response relationship revealed that HNE and HND during 0-90 days before semen collection had a significantly inverse association with sperm motility. Specifically, a 1 °C increase in HNE was associated with decreased sperm progressive motility of 0.0090 (95% confidence interval [CI]: -0.0147, -0.0033) and decreased total motility of 0.0094 (95% CI: -0.0160, -0.0029). HND was significantly associated with reduced sperm progressive motility and total motility of 0.0021 (95% CI: -0.0040, -0.0003) and 0.0023 (95% CI: -0.0043, -0.0002), respectively. Consistent results were observed at different temperature thresholds on hot nights.
CONCLUSION:
Our findings highlight the need to mitigate nocturnal heat exposure during spermatogenesis to maintain optimal semen quality.
Humans; Male; Semen Analysis; Adult; Sperm Motility; Hot Temperature/adverse effects*; China; Middle Aged; Spermatozoa/physiology*; Young Adult
10. Association of Body Mass Index with All-Cause Mortality and Cause-Specific Mortality in Rural China: 10-Year Follow-up of a Population-Based Multicenter Prospective Study.
Juan Juan HUANG ; Yuan Zhi DI ; Ling Yu SHEN ; Jian Guo LIANG ; Jiang DU ; Xue Fang CAO ; Wei Tao DUAN ; Ai Wei HE ; Jun LIANG ; Li Mei ZHU ; Zi Sen LIU ; Fang LIU ; Shu Min YANG ; Zu Hui XU ; Cheng CHEN ; Bin ZHANG ; Jiao Xia YAN ; Yan Chun LIANG ; Rong LIU ; Tao ZHU ; Hong Zhi LI ; Fei SHEN ; Bo Xuan FENG ; Yi Jun HE ; Zi Han LI ; Ya Qi ZHAO ; Tong Lei GUO ; Li Qiong BAI ; Wei LU ; Qi JIN ; Lei GAO ; He Nan XIN
Biomedical and Environmental Sciences 2025;38(10):1179-1193
OBJECTIVE:
This study aimed to explore the association between body mass index (BMI) and mortality based on a 10-year population-based multicenter prospective study.
METHODS:
A general population-based multicenter prospective study was conducted at four sites in rural China between 2013 and 2023. Multivariate Cox proportional hazards models and restricted cubic spline analyses were used to assess the association between BMI and mortality. Stratified analyses were performed based on the individual characteristics of the participants.
RESULTS:
Overall, 19,107 participants contributing a total of 163,095 person-years were included, of whom 1,910 died. Underweight (< 18.5 kg/m²) was associated with increased all-cause mortality (adjusted hazard ratio [aHR] = 2.00, 95% confidence interval [CI]: 1.66-2.41), whereas overweight (≥ 24.0 to < 28.0 kg/m²) and obesity (≥ 28.0 kg/m²) were associated with decreased all-cause mortality, with aHRs of 0.61 (95% CI: 0.52-0.73) and 0.51 (95% CI: 0.37-0.70), respectively. Among people older than 60 years, overweight (aHR = 0.76, 95% CI: 0.67-0.86) and mild obesity (aHR = 0.72, 95% CI: 0.59-0.87) were associated with lower mortality. All-cause mortality decreased rapidly until reaching a BMI of 25.7 kg/m² (aHR = 0.95, 95% CI: 0.92-0.98) and increased slightly above that value, indicating a U-shaped association. The lower mortality associated with overweight was robust in most subgroups and sensitivity analyses.
CONCLUSION:
This study provides additional evidence that overweight and mild obesity may be inversely related to the risk of death in individuals older than 60 years. Therefore, it is essential to consider age differences when formulating health and weight management strategies.
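The BMI cut-offs quoted in the Results above can be written as a small classifier. This is a sketch under stated assumptions rather than the study's procedure: the function names and category labels are hypothetical, and the "normal" range (18.5 to < 24.0 kg/m²) is inferred from the gap between the stated underweight and overweight cut-offs, since the abstract does not state it. The abstract also does not define "mild obesity", so that subgroup is not modeled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the categories quoted in the abstract.

    Cut-offs from the abstract: underweight < 18.5, overweight
    >= 24.0 to < 28.0, obesity >= 28.0. The 'normal' band is an
    assumption inferred from the remaining range.
    """
    if value < 18.5:
        return "underweight"
    if value < 24.0:
        return "normal"  # assumed reference band, not stated in the abstract
    if value < 28.0:
        return "overweight"
    return "obesity"


# Hypothetical participant: 80 kg, 1.75 m.
category = bmi_category(bmi(80, 1.75))
```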
Humans; Body Mass Index; China/epidemiology*; Male; Female; Middle Aged; Prospective Studies; Rural Population/statistics & numerical data*; Aged; Follow-Up Studies; Adult; Mortality; Cause of Death; Obesity/mortality*; Overweight/mortality*
