1. Neural network for auditory speech enhancement featuring feedback-driven attention and lateral inhibition.
Yudong CAI ; Xue LIU ; Xiang LIAO ; Yi ZHOU
Journal of Biomedical Engineering 2025;42(1):82-89
The human brain's mechanisms for processing speech information are an important source of inspiration for speech enhancement technology. Attention and lateral inhibition are key mechanisms in auditory information processing that can selectively enhance specific information. Building on this, the study introduces a dual-branch U-Net that integrates lateral inhibition and feedback-driven attention mechanisms. Noisy speech signals were fed into the first branch of the U-Net, and time-frequency units with high confidence were selectively fed back. The resulting activation-layer gradients, together with the lateral inhibition mechanism, were used to calculate attention maps. These maps were then concatenated to the second branch of the U-Net, directing the network's focus and achieving selective enhancement of auditory speech signals. Speech enhancement performance was evaluated using five metrics, including perceptual evaluation of speech quality (PESQ). The method was compared with five other methods: Wiener, SEGAN, PHASEN, Demucs and GRN. The experimental results demonstrated that the proposed method improved speech enhancement in various noise scenarios by 18% to 21% over the baseline network across multiple performance metrics. The improvement was particularly notable in low signal-to-noise ratio conditions, where the proposed method showed a significant performance advantage over the other methods. The speech enhancement technique based on lateral inhibition and feedback-driven attention mechanisms holds significant potential in auditory speech enhancement, making it suitable for clinical practices related to cochlear implants and hearing aids.
Humans; Attention/physiology*; Speech Perception/physiology*; Neural Networks, Computer; Speech; Noise; Feedback
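The abstract of record 1 does not give the formula used for lateral inhibition, so the following is only a minimal sketch of the general idea: each unit is suppressed by a fraction of its neighbours' mean activation, and the result is rescaled into an attention map. The function names, the `radius` and `strength` parameters, and the one-dimensional layout are illustrative assumptions, not the authors' implementation.

```python
def lateral_inhibition(values, radius=1, strength=0.5):
    """Subtract a fraction of the mean neighbouring activation from each unit.

    `radius` and `strength` are hypothetical parameters chosen for illustration.
    """
    n = len(values)
    inhibited = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbours = [values[j] for j in range(lo, hi) if j != i]
        inhibition = strength * sum(neighbours) / len(neighbours) if neighbours else 0.0
        # Rectify so that strongly inhibited units fall to zero rather than negative.
        inhibited.append(max(0.0, v - inhibition))
    return inhibited


def to_attention_map(values):
    """Rescale non-negative activations so that the strongest unit equals 1."""
    peak = max(values)
    if peak == 0:
        return [0.0] * len(values)
    return [v / peak for v in values]


# Toy activations over five neighbouring frequency bins, with one salient peak.
activations = [0.1, 0.2, 1.0, 0.2, 0.1]
attention = to_attention_map(lateral_inhibition(activations))
```

In this toy case the salient bin keeps full weight while its weaker neighbours are suppressed, which is the contrast-sharpening effect that lateral inhibition is generally meant to provide.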
2. Research on bimodal emotion recognition algorithm based on multi-branch bidirectional multi-scale time perception.
Peiyun XUE ; Sibin WANG ; Jing BAI ; Yan QIANG
Journal of Biomedical Engineering 2025;42(3):528-536
Emotion can reflect the psychological and physiological health of human beings, and voice and facial expression are the main ways in which human emotion is expressed. How to extract and effectively integrate emotion information from these two modalities is one of the main challenges in emotion recognition. In this paper, a multi-branch bidirectional multi-scale time perception model is proposed, which processes the speech Mel-frequency spectrum coefficients in both the forward and reverse directions along the time dimension. The model also uses causal convolution to obtain temporal correlation information between features at different scales and assigns attention maps to them according to this information, yielding a multi-scale fusion of speech emotion features. In addition, this paper proposes a bimodal dynamic feature fusion algorithm, which draws on the advantages of AlexNet and uses overlapping max-pooling layers to obtain richer fusion features from the concatenated feature matrices of the different modalities. Experimental results show that the proposed multi-branch bidirectional multi-scale time perception bimodal emotion recognition model reaches accuracies of 97.67% and 90.14% on two public audio-video emotion datasets, respectively, outperforming other common methods. This indicates that the proposed model can effectively capture emotion feature information and improve the accuracy of emotion recognition.
Humans; Emotions; Algorithms; Facial Expression; Time Perception; Neural Networks, Computer; Speech
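To make the combination of causal convolution and bidirectional processing in record 2 more concrete, here is a minimal sketch on a one-dimensional sequence. It is not the model from the paper: the kernel, the `dilation` parameter, and the function names are assumptions, and a real system would operate on multi-channel spectral features with learned weights.

```python
def causal_conv1d(x, kernel, dilation=1):
    """Causal convolution: y[t] depends only on x[t], x[t - d], x[t - 2d], ...

    Positions before the start of the sequence are treated as zeros.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def bidirectional_features(x, kernel, dilation=1):
    """Apply the same causal filter to the sequence and to its time reversal."""
    forward = causal_conv1d(x, kernel, dilation)
    # Reverse, filter causally, then reverse back to align with the time axis.
    backward = causal_conv1d(x[::-1], kernel, dilation)[::-1]
    return forward, backward


frames = [1.0, 2.0, 3.0, 4.0]  # stand-in for one coefficient tracked over time
forward, backward = bidirectional_features(frames, kernel=[1.0, 1.0])
```

The forward branch at time t sees only the past, the backward branch only the future, so the two together cover the full temporal context while each filter remains causal in its own direction.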
3. A method for emotion transition recognition using cross-modal feature fusion and global perception.
Lilin JIE ; Yangmeng ZOU ; Zhengxiu LI ; Baoliang LYU ; Weilong ZHENG ; Ming LI
Journal of Biomedical Engineering 2025;42(5):977-986
Current studies on electroencephalogram (EEG) emotion recognition primarily concentrate on discrete stimulus paradigms under controlled laboratory settings, which cannot adequately represent the dynamic transition characteristics of emotional states during multi-context interactions. To address this issue, this paper proposes a novel method for emotion transition recognition that leverages a cross-modal feature fusion and global perception network (CFGPN). Firstly, an experimental paradigm encompassing six types of emotion transition scenarios was designed, and EEG and eye movement data were simultaneously collected from 20 participants, each annotated with dynamic continuous emotion labels. Subsequently, deep canonical correlation analysis integrated with a cross-modal attention mechanism was employed to fuse features from EEG and eye movement signals, resulting in multimodal feature vectors enriched with highly discriminative emotional information. These vectors were then input into a parallel hybrid architecture that combines convolutional neural networks (CNNs) and Transformers. The CNN captures local time-series features, whereas the Transformer uses its global perception capabilities to model long-range temporal dependencies, enabling accurate dynamic emotion transition recognition. The results demonstrate that the proposed method achieves the lowest mean squared error in both valence and arousal recognition tasks on the dynamic emotion transition dataset and a classic multimodal emotion dataset. It exhibits superior recognition accuracy and stability when compared with five existing unimodal and six multimodal deep learning models. The approach enhances both adaptability and robustness in recognizing emotional state transitions in real-world scenarios, showing promising potential for applications in the field of biomedical engineering.
Humans; Emotions/physiology*; Electroencephalography; Neural Networks, Computer; Eye Movements; Perception
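The cross-modal attention step mentioned in record 3 can be illustrated with plain scaled dot-product attention, in which features from one modality act as queries over keys and values from the other. This is a generic sketch, not the CFGPN architecture: the abstract does not specify the attention variant, the deep canonical correlation analysis stage is omitted, and all names and toy vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention: one modality queries the other.

    queries: feature vectors from modality A (e.g. EEG, hypothetical)
    keys, values: feature vectors from modality B (e.g. eye movement, hypothetical)
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of modality-B value vectors.
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused


eeg_queries = [[1.0, 0.0]]
eye_keys = [[10.0, 0.0], [0.0, 10.0]]
eye_values = [[1.0, 2.0], [3.0, 4.0]]
fused = cross_modal_attention(eeg_queries, eye_keys, eye_values)
```

Here the query aligns with the first key, so the fused vector lies close to the first value vector; with equally similar keys it would fall back to a plain average.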
4. Neural Basis of Categorical Representations of Animal Body Silhouettes.
Neuroscience Bulletin 2025;41(2):211-223
Neural activities differentiating bodies versus non-body stimuli have been identified in the occipitotemporal cortex of both humans and nonhuman primates. However, the neural mechanisms of coding the similarity of different individuals' bodies of the same species to support their categorical representations remain unclear. Using electroencephalography (EEG) and magnetoencephalography (MEG), we investigated the temporal and spatial characteristics of neural processes shared by different individual body silhouettes of the same species by quantifying the repetition suppression of neural responses to human and animal (chimpanzee, dog, and bird) body silhouettes showing different postures. Our EEG results revealed significant repetition suppression of the amplitudes of early frontal/central activity at 180-220 ms (P2) and late occipitoparietal activity at 220-320 ms (P270) in response to animal (but not human) body silhouettes of the same species. Our MEG results further localized the repetition suppression effect related to animal body silhouettes in the left supramarginal gyrus and left frontal cortex at 200-440 ms after stimulus onset. Our findings suggest two neural processes that are involved in spontaneous categorical representations of animal body silhouettes as a cognitive basis of human-animal interactions.
Humans; Animals; Male; Electroencephalography; Magnetoencephalography; Female; Young Adult; Adult; Pattern Recognition, Visual/physiology*; Brain Mapping; Photic Stimulation; Brain/physiology*; Dogs
5. Rhythm Facilitates Auditory Working Memory via Beta-Band Encoding and Theta-Band Maintenance.
Suizi TIAN ; Yu-Ang CHENG ; Huan LUO
Neuroscience Bulletin 2025;41(2):195-210
Rhythm, as a prominent characteristic of auditory experiences such as speech and music, is known to facilitate attention, yet its contribution to working memory (WM) remains unclear. Here, human participants temporarily retained a 12-tone sequence presented rhythmically or arrhythmically in WM and performed a pitch change-detection task. Behaviorally, although accuracy was comparable, rhythmic tone sequences yielded faster response times and lower response boundaries in decision-making. Electroencephalographic recordings revealed that rhythmic sequences elicited enhanced non-phase-locked beta-band (16-33 Hz) and theta-band (3-5 Hz) neural oscillations during sensory encoding and WM retention periods, respectively. Importantly, the two-stage neural signatures were correlated with each other and contributed to behavior. As beta-band and theta-band oscillations denote the engagement of motor systems and WM maintenance, respectively, our findings imply that rhythm facilitates auditory WM through intricate oscillation-based interactions between the motor and auditory systems that facilitate predictive attention to auditory sequences.
Humans; Memory, Short-Term/physiology*; Male; Beta Rhythm/physiology*; Female; Theta Rhythm/physiology*; Young Adult; Auditory Perception/physiology*; Adult; Electroencephalography; Acoustic Stimulation; Reaction Time/physiology*; Brain/physiology*; Attention/physiology*
6. Functional Connectivity Encodes Sound Locations by Lateralization Angles.
Renjie TONG ; Shaoyi SU ; Ying LIANG ; Chunlin LI ; Liwei SUN ; Xu ZHANG
Neuroscience Bulletin 2025;41(2):261-271
The ability to localize sound sources rapidly allows human beings to efficiently understand the surrounding environment. Previous studies have suggested that there is an auditory "where" pathway in the cortex for processing sound locations. The neural activation in regions along this pathway encodes sound locations by opponent hemifield coding, in which each unilateral region is activated by sounds coming from the contralateral hemifield. However, it is still unclear how these regions interact with each other to form a unified representation of the auditory space. In the present study, we investigated whether functional connectivity in the auditory "where" pathway encoded sound locations during passive listening. Participants underwent functional magnetic resonance imaging while passively listening to sounds from five distinct horizontal locations (-90°, -45°, 0°, 45°, 90°). We were able to decode sound locations from the functional connectivity patterns of the "where" pathway. Furthermore, we found that such neural representation of sound locations was primarily based on the coding of sound lateralization angles relative to the frontal midline. In addition, whole-brain analysis indicated that functional connectivity between occipital regions and the primary auditory cortex also encoded sound locations by lateralization angles. Overall, our results reveal a lateralization-angle-based representation of sound locations encoded by functional connectivity patterns, which could complement the activation-based opponent hemifield coding to provide a more precise representation of the auditory space.
Humans; Sound Localization/physiology*; Male; Female; Magnetic Resonance Imaging; Young Adult; Functional Laterality/physiology*; Adult; Brain Mapping; Auditory Cortex/physiology*; Acoustic Stimulation; Auditory Pathways/physiology*; Brain/physiology*
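The contrast between the two coding schemes named in record 6 can be shown with a toy comparison over the five azimuths from the abstract. The sketch reads "lateralization angle" as the unsigned angular distance from the frontal midline and reduces opponent hemifield coding to a left/right sign; both readings, and the model-dissimilarity matrices built from them, are simplifying assumptions rather than the authors' analysis.

```python
AZIMUTHS = [-90, -45, 0, 45, 90]  # horizontal sound locations listed in the abstract


def lateralization_angle(azimuth):
    """Unsigned angular distance from the frontal midline (assumed reading)."""
    return abs(azimuth)


def hemifield(azimuth):
    """Simplified hemifield code: -1 for left, 0 for midline, +1 for right."""
    if azimuth < 0:
        return -1
    if azimuth > 0:
        return 1
    return 0


def model_dissimilarity(code):
    """Pairwise absolute difference between the codes of all location pairs."""
    return [[abs(code(a) - code(b)) for b in AZIMUTHS] for a in AZIMUTHS]


by_angle = model_dissimilarity(lateralization_angle)
by_hemifield = model_dissimilarity(hemifield)
```

Under the angle-based code, -45° and 45° are indistinguishable while -90° and -45° differ; under the hemifield code the pattern is reversed. That difference is what lets the two schemes carry complementary information about location.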
7. Brain White Matter Changes in Non-demented Individuals with Color Discrimination Deficits and Their Association with Cognitive Impairment: A NODDI Study.
Jiejun ZHANG ; Peilin HUANG ; Lin LIN ; Yingzhe CHENG ; Weipin WENG ; Jiahao ZHENG ; Yixin SUN ; Shaofan JIANG ; Xiaodong PAN
Neuroscience Bulletin 2025;41(8):1364-1376
Previous studies have found associations between color discrimination deficits and cognitive impairments besides aging. However, investigations into the microstructural pathology of brain white matter (WM) associated with these deficits remain limited. This study aimed to examine the microstructural characteristics of WM in the non-demented population with abnormal color discrimination, utilizing Neurite Orientation Dispersion and Density Imaging (NODDI), and to explore their correlations with cognitive functions and cognition-related plasma biomarkers. The tract-based spatial statistics analysis revealed significant differences in specific brain regions between the abnormal color discrimination group and the healthy controls, characterized by increased isotropic volume fraction and decreased neurite density index and orientation dispersion index. Further analysis of region-of-interest parameters revealed that the isotropic volume fraction in the bilateral anterior thalamic radiation, superior longitudinal fasciculus, cingulum, and forceps minor was significantly correlated with poorer performance on neuropsychological assessments and, to varying degrees, with various cognition-related plasma biomarkers. These findings provide neuroimaging evidence that WM microstructural abnormalities in non-demented individuals with abnormal color discrimination are associated with cognitive dysfunction, potentially serving as early markers for cognitive decline.
Humans; White Matter/pathology*; Male; Female; Cognitive Dysfunction/physiopathology*; Middle Aged; Aged; Color Perception/physiology*; Brain/pathology*; Neuropsychological Tests; Diffusion Tensor Imaging
8. Neural Dynamics of Visual Stream Interactions During Memory-Guided Actions Investigated by Intracranial EEG.
Sofiia MORARESKU ; Jiri HAMMER ; Vasileios DIMAKOPOULOS ; Michaela KAJSOVA ; Radek JANCA ; Petr JEZDIK ; Adam KALINA ; Petr MARUSIC ; Kamil VLCEK
Neuroscience Bulletin 2025;41(8):1347-1363
The dorsal and ventral visual streams have been considered to play distinct roles in visual processing for action: the dorsal stream is assumed to support real-time actions, while the ventral stream facilitates memory-guided actions. However, recent evidence suggests a more integrated function of these streams. We investigated the neural dynamics and functional connectivity between them during memory-guided actions using intracranial EEG. We tracked neural activity in the inferior parietal lobule in the dorsal stream, and the ventral temporal cortex in the ventral stream as well as the hippocampus during a delayed action task involving object identity and location memory. We found increased alpha power in both streams during the delay, indicating their role in maintaining spatial visual information. In addition, we recorded increased alpha power in the hippocampus during the delay, but only when both object identity and location needed to be remembered. We also recorded an increase in theta band phase synchronization between the inferior parietal lobule and ventral temporal cortex and between the inferior parietal lobule and hippocampus during the encoding and delay. Granger causality analysis indicated dynamic and frequency-specific directional interactions among the inferior parietal lobule, ventral temporal cortex, and hippocampus that varied across task phases. Our study provides unique electrophysiological evidence for close interactions between dorsal and ventral streams, supporting an integrated processing model in which both streams contribute to memory-guided actions.
Humans; Male; Female; Adult; Young Adult; Hippocampus/physiology*; Memory/physiology*; Parietal Lobe/physiology*; Temporal Lobe/physiology*; Visual Perception/physiology*; Electrocorticography; Visual Pathways/physiology*; Electroencephalography
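Record 8 reports theta-band phase synchronization between regions but its abstract does not name the measure used. One widely used measure is the phase-locking value (PLV), sketched below for two series of instantaneous phases in radians. Treating PLV as the metric is an assumption, and the band-pass filtering and phase extraction that would precede it are omitted.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """PLV = |mean(exp(i * (phase_a - phase_b)))|, ranging from 0 to 1.

    A value near 1 means the phase difference is stable across samples;
    a value near 0 means the phase difference is spread around the circle.
    """
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


# Two signals with a constant lag of 0.5 rad: maximal synchronization.
site_a = [0.1 * k for k in range(100)]
site_b = [p - 0.5 for p in site_a]
locked = phase_locking_value(site_a, site_b)

# Phase differences spread evenly around the circle: no synchronization.
spread = [2 * math.pi * k / 8 for k in range(8)]
unlocked = phase_locking_value(spread, [0.0] * 8)
```

Note that PLV is insensitive to the size of the lag itself; only its consistency matters, which is why a fixed 0.5 rad offset still counts as full synchronization.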
9. Neural Tracking of Race-Related Information During Face Perception.
Chenyu PANG ; Na ZHOU ; Yiwen DENG ; Yue PU ; Shihui HAN
Neuroscience Bulletin 2025;41(11):1957-1976
Previous studies have identified two group-level processes, neural representations of interracial between-group difference and intraracial within-group similarity, that contribute to the racial categorization of faces. What remains unclear is how the brain tracks race-related information that varies across different faces as an individual-level neural process involved in race perception. In three studies, we recorded functional MRI signals when Chinese adults performed different tasks on morphed faces in which proportions of pixels contributing to perceived racial identity (Asian vs White) and expression (pain vs neutral) varied independently. We found that, during a pain expression judgment task, tracking other-race and same-race-related information in perceived faces recruited the ventral occipitotemporal cortices and medial prefrontal/anterior temporal cortices, respectively. However, neural tracking of race-related information tended to be weakened during explicit race judgments on perceived faces. During a donation task, the medial prefrontal activity also tracked race-related information that distinguished between two perceived faces for altruistic decision-making and encoded the Euclidean distance between the two faces that predicted decision-making speeds. Our findings revealed task-dependent neural mechanisms underlying the tracking of race-related information during face perception and altruistic decision-making.
Adult; Female; Humans; Male; Young Adult; Brain/diagnostic imaging*; Brain Mapping; Decision Making/physiology*; Facial Recognition/physiology*; Judgment/physiology*; Magnetic Resonance Imaging; Photic Stimulation; Racial Groups; Social Perception; East Asian People
10. Interoceptive Dysfunction in Psychiatric Disorders and Non-invasive Neuromodulation for Improving Interoception.
Huiru CUI ; Jijun WANG ; Chunbo LI
Neuroscience Bulletin 2025;41(8):1487-1499
Dysfunction of the interoceptive system is recognized as an important component of clinical symptoms, including anxiety, depression, psychosis, and other mental disorders. Non-invasive neuromodulation is an emerging clinical intervention approach, and over the past decade, research on non-invasive neuromodulation aimed at regulating interoception has rapidly developed. This review first outlines the pathways of interoceptive signals and assessment methods, then summarizes the interoceptive abnormalities in psychiatric disorders and current studies for non-invasive neuromodulation targeting interoception, including intervention modes, target sites, interoceptive measures, and potential neurobiological mechanisms. Finally, we discuss significant research challenges and future directions.
Humans; Interoception/physiology*; Mental Disorders/therapy*
