1. Emotional time-based detection of patients with bipolar disorder based on deep learning speech analysis
Zhiying LI ; Jun JI ; Shuzhe ZHOU ; Jiaqi LI ; Xinhui LI ; Chaonan FENG ; Lili GUAN ; Zaohui MA ; Yantao MA
Chinese Journal of Psychiatry 2024;57(4):207-212
Objective: To use a speech-based deep learning approach to distinguish between depressive and manic mood states in patients with bipolar disorder (BD). Methods: Sixty-one BD patients who visited the psychiatric outpatient department of Peking University Sixth Hospital were recruited between June 2018 and March 2022. The Quick Inventory of Depressive Symptomatology, the Mood Disorder Questionnaire, and the Young Mania Rating Scale were used to determine patients' mood states. The patients' voices were recorded, including 190 samples during the patients' remission, depressive, and manic mood periods, respectively. A total of 136 features, including Mel-frequency cepstral coefficients and zero-crossing rates, were extracted from the voice samples using a speech analysis library in Python. A LIGHT-SERNET-based network was then trained for emotion classification. Accuracy was used to evaluate the overall performance of the model; sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the receiver operating characteristic (ROC) curve were used to evaluate the model's predictions for the three mood states. Kruskal-Wallis H tests or χ² tests were used to compare demographic characteristics among the three groups. Results: There were statistically significant differences among the three groups in age (H=25.83, P<0.001), years of education (H=25.25, P<0.001), and marital status (χ²=23.81, P<0.001). There was no significant difference in gender (χ²=4.63, P=0.099). The accuracy of the model in detecting the three mood states was 0.84. For remission, sensitivity and specificity were 0.88 and 0.93, and PPV and NPV were 0.87 and 0.94, respectively. For depressive episodes, sensitivity and specificity were 0.82 and 0.92, and PPV and NPV were 0.84 and 0.92, respectively. For manic episodes, sensitivity and specificity were 0.82 and 0.91, and PPV and NPV were 0.83 and 0.91, respectively. The areas under the ROC curves for the three mood states were similar and all exceeded 0.90. Conclusion: The LIGHT-SERNET-based deep learning model shows good ability to discriminate between depressive and manic mood states based on speech analysis.
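
The per-state sensitivity, specificity, PPV, and NPV reported for a three-class model are usually obtained by treating each class in turn as "positive" and the other two as "negative" (one-vs-rest). The abstract does not say how the authors computed them, so the following is only a minimal Python sketch under that assumption. The function names and the confusion-matrix counts are hypothetical and are not the study's data.

```python
from typing import Dict, List


def one_vs_rest_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class metrics from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Assumes every class has at least one
    true and one predicted sample (no zero denominators).
    """
    total = sum(sum(row) for row in cm)
    results: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                          # class k predicted as k
        fn = sum(cm[k]) - tp                   # class k predicted as something else
        fp = sum(row[k] for row in cm) - tp    # other classes predicted as k
        tn = total - tp - fn - fp              # everything else
        results[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    return results


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


# Hypothetical counts for illustration only (NOT the study's data).
labels = ["remission", "depressive", "manic"]
cm = [
    [8, 1, 1],   # true remission
    [1, 7, 2],   # true depressive
    [0, 2, 8],   # true manic
]

metrics = one_vs_rest_metrics(cm, labels)
overall = accuracy(cm)

for label in labels:
    print(label, {name: round(value, 2) for name, value in metrics[label].items()})
print("accuracy", round(overall, 2))
```

With balanced classes, as in this toy matrix, overall accuracy equals the mean of the per-class sensitivities, which is why a single accuracy figure can sit alongside three separate sensitivity values.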
