1. Research on a bimodal emotion recognition algorithm based on multi-branch bidirectional multi-scale time perception.
Peiyun XUE ; Sibin WANG ; Jing BAI ; Yan QIANG
Journal of Biomedical Engineering 2025;42(3):528-536
Emotion reflects human psychological and physiological health, and its main expressions are speech and facial expression. How to extract and effectively fuse the emotion information of these two modalities is one of the main challenges in emotion recognition. This paper proposes a multi-branch bidirectional multi-scale time perception model that processes the Mel-frequency cepstral coefficients of speech in both the forward and reverse time directions. The model also uses causal convolution to obtain temporal correlation information between features at different scales and assigns attention maps accordingly, thereby obtaining a multi-scale fusion of speech emotion features. Secondly, this paper proposes a bimodal dynamic feature fusion algorithm that draws on the strengths of AlexNet and uses overlapping max-pooling layers to obtain richer fusion features from the concatenated feature matrices of the two modalities. Experimental results show that the proposed multi-branch bidirectional multi-scale time perception bimodal emotion recognition model reaches accuracies of 97.67% and 90.14% on two public audio-visual emotion datasets, outperforming other common methods. This indicates that the proposed model can effectively capture emotional feature information and improve the accuracy of emotion recognition.
Humans; Emotions; Algorithms; Facial Expression; Time Perception; Neural Networks, Computer; Speech
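Two ingredients named in the abstract above, bidirectional (forward and time-reversed) processing of spectral speech features and causal convolution, can be illustrated with a minimal numpy sketch. All function names and shapes here are hypothetical simplifications, not the authors' implementation:

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution: the output at time t depends only on inputs at times <= t."""
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])  # left-pad so no future samples leak in
    return np.array([padded[t:t + k] @ kernel[::-1] for t in range(len(x))])

def bidirectional_features(mfcc):
    """Stack forward and time-reversed MFCC frames along the feature axis."""
    return np.concatenate([mfcc, mfcc[::-1]], axis=-1)

# Toy MFCC matrix: 10 frames x 13 coefficients
mfcc = np.random.default_rng(0).normal(size=(10, 13))
bi = bidirectional_features(mfcc)  # shape (10, 26): forward + reversed views

# A kernel of [1, 0] passes the current sample through unchanged,
# confirming the causal alignment of the convolution.
out = causal_conv1d(np.array([1.0, 2.0, 3.0]), np.array([1.0, 0.0]))
```

In the paper's model, causal convolutions at several kernel sizes would yield multi-scale temporal features that are then weighted by attention maps; the sketch only shows the causal-alignment mechanics.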
2. A multiscale feature extraction algorithm for dysarthric speech recognition.
Jianxing ZHAO ; Peiyun XUE ; Jing BAI ; Chenkang SHI ; Bo YUAN ; Tongtong SHI
Journal of Biomedical Engineering 2023;40(1):44-50
To address the difficulty of improving the speech recognition rate for dysarthric speech, this paper proposes a multi-scale mel-domain feature map extraction algorithm. The speech signal is decomposed by empirical mode decomposition, and Fbank features and their first-order differences are extracted for each of the three effective components to construct a new feature map that captures details in the frequency domain. Secondly, because training a single-channel neural network suffers from loss of effective features and high computational complexity, a speech recognition network model is proposed. Finally, training and decoding were performed on the public UA-Speech dataset. Experimental results showed that the accuracy of the proposed speech recognition model reached 92.77%. The proposed algorithm can therefore effectively improve the speech recognition rate for dysarthric speech.
Humans; Dysarthria/diagnosis*; Speech; Speech Perception; Algorithms; Neural Networks, Computer
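The feature-map construction described in the abstract above, Fbank features plus their first-order differences for each decomposed component, can be sketched as follows. The delta formula is the standard regression-window form; the random matrices stand in for the Fbank features of three empirical-mode-decomposition components, and all names are hypothetical:

```python
import numpy as np

def delta(feat, n=2):
    """First-order difference (delta) features via the standard regression formula."""
    padded = np.pad(feat, ((n, n), (0, 0)), mode="edge")  # repeat edge frames
    denom = 2 * sum(i * i for i in range(1, n + 1))
    return sum(
        i * (padded[n + i:len(feat) + n + i] - padded[n - i:len(feat) + n - i])
        for i in range(1, n + 1)
    ) / denom

def build_feature_map(components):
    """Stack each component's Fbank features and deltas into channels of one feature map."""
    chans = []
    for fbank in components:
        chans.append(fbank)
        chans.append(delta(fbank))
    return np.stack(chans, axis=0)  # (2 * num_components, frames, mel_bins)

rng = np.random.default_rng(1)
comps = [rng.normal(size=(50, 40)) for _ in range(3)]  # 3 components, 50 frames, 40 mel bins
fmap = build_feature_map(comps)  # shape (6, 50, 40)
```

The resulting multi-channel map could then feed a convolutional recognizer; a constant input yields zero deltas, which is a quick sanity check on the difference formula.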
3. An acoustic-articulatory study of the nasal finals in students with and without hearing loss.
Qing WANG ; Jing BAI ; Peiyun XUE ; Xueying ZHANG ; Pei FENG
Journal of Biomedical Engineering 2018;35(2):198-205
The central aim of this experiment was to compare the articulatory and acoustic characteristics of students with normal hearing (NH) and school-aged students with hearing loss (HL), and to explore articulatory-acoustic relations during the nasal finals. Fourteen HL students and ten NH controls were enrolled, and the data of 4 HL students were excluded because of their high pronunciation error rate. Data were collected using an electromagnetic articulograph. The acoustic and kinematic data of the nasal finals were extracted with phonetics and data-processing software, and all data were analyzed by significance tests and correlation analysis. The results show statistically significant differences (P < 0.05 or P < 0.01) between the HL and NH groups across different vowels in the first two formant frequencies (F1, F2), tongue position, and articulatory-acoustic relations. The HL group's relation between vertical tongue movement and F1 in /en/ and /eng/ was the same as that of the NH group. These findings on participants with HL can support speech rehabilitation training aimed at increasing their pronunciation accuracy.
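The articulatory-acoustic correlation analysis described above can be illustrated with a toy computation. The numbers below are made up for illustration only; they merely reflect the well-known tendency of F1 to fall as tongue height increases:

```python
import numpy as np

# Hypothetical data: vertical tongue position (arbitrary units) vs. first formant F1 (Hz).
tongue_height = np.array([2.1, 3.4, 4.0, 5.2, 6.1])
f1 = np.array([750.0, 620.0, 560.0, 470.0, 400.0])  # F1 decreases as the tongue rises

# Pearson correlation coefficient between articulatory and acoustic measures.
r = np.corrcoef(tongue_height, f1)[0, 1]
```

A strongly negative `r` would indicate a tight articulatory-acoustic relation of the kind the study compares between the HL and NH groups.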
4. An immunohistochemical study on Tropic 1808 gene expressional protein in the lesioned sciatic nerve.
Xue CHEN ; Peiyun ZHANG ; Xiaodong WANG ; Jian WU
Acta Anatomica Sinica 1955;0(03):-
Objective To study the expression and distribution of the Tropic 1808 gene expressional protein in the lesioned sciatic nerves of adult SD rats. Methods Both the distal and proximal segments of SD rat sciatic nerves were obtained 12 days after transection. Immunohistochemistry and dual fluorescent immunohistochemistry were then performed to observe the expression of the Tropic 1808 gene expressional protein in the lesioned nerves. The intensity and area of positive immunoreactivity were measured with an image analysis system. Results Schwann cell membranes in both the distal and proximal segments of the lesioned sciatic nerves showed positive expression of the Tropic 1808 gene expressional protein. The intensity and area of positive immunoreactivity in the distal segment were greater than those in the proximal segment. Conclusion After peripheral nerve injury, the Tropic 1808 gene expressional protein is expressed in the Schwann cell membranes of both the distal and proximal segments, with higher expression in the distal segment.
