What question did this study set out to answer?

This research investigates how emotional features in a speaker's voice affect neural processing during speech in noisy environments.

April 3, 2026

Discovering Emotion in a Cocktail Party: How Emotional Learning Shapes Neural Dynamics in Speech-on-Speech Masking

Key Points

Used an emotional learning paradigm in a speech-on-speech context, pairing two target speakers with either angry or neutral facial expressions.
Recorded electroencephalogram (EEG) data from healthy individuals.
Applied multivariate pattern analysis and representational similarity analysis to track how learned emotion linked to the target speaker's voice unfolds over time.
Detected early neural signatures of emotional processing between 150 and 180 ms after stimulus onset, nearly simultaneous with the decoding of speaker identity.
Found that subjective emotional valence could be decoded from brain signals as early as 94 ms.
Found evidence that vocal emotion is processed rapidly and relatively independently of low-level acoustic cues.

Abstract

Purpose: In a noisy environment such as a cocktail party, emotional signals play a crucial role in helping listeners unmask target speech. However, it remains unclear how emotional features carried in a speaker's vocal timbre shape neural processing over time. This study aimed to characterize the temporal neural dynamics of learned emotion associated with a speaker's voice in complex listening conditions.

Method: We employed an emotional learning paradigm in a speech-on-speech context, pairing two different target speakers with either angry or neutral facial expressions. Electroencephalogram data were recorded from healthy participants, and multivariate pattern analysis combined with representational similarity analysis was used to track the temporal unfolding of learned emotion linked to the target speaker's voice.

Results: We observed early neural signatures of emotional processing between 150 and 180 ms after stimulus onset, occurring nearly simultaneously with the decoding of speaker identity. Importantly, brain–behavior analysis revealed that subjective emotional valence ratings could be decoded from neural signals as early as 94 ms. These findings suggest that vocal emotion can be processed rapidly and relatively independently of the processing of low-level acoustic cues.

Conclusion: Our study provides evidence that acquired emotional associations with a speaker's voice can shape early-stage neural dynamics during speech processing under challenging listening conditions.

Supplemental Material: https://doi.org/10.23641/asha.31842814
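
As a rough illustration of the representational similarity analysis mentioned in the Method, the Python sketch below computes a time-resolved model fit on toy data. It is not the authors' pipeline. The data are made up, and the condition labels, latencies, and function names (`pearson`, `neural_rdm`, `rsa_time_course`) are hypothetical. It also assumes plain Pearson correlation, with no cross-validation or statistical testing.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def neural_rdm(patterns):
    """Upper triangle of the correlation-distance matrix between condition patterns."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]

def rsa_time_course(data, model_rdm):
    """data[time][condition][channel] -> one model-fit value per time point."""
    return [pearson(neural_rdm(patterns), model_rdm) for patterns in data]

# Hypothetical toy data: 4 conditions x 4 channels at 5 time points.
# Condition order: [angry/token1, angry/token2, neutral/token1, neutral/token2].
A = [1.0, 0.0, -1.0, 0.0]
B = [0.0, 1.0, 0.0, -1.0]
by_token = [A, B, A, B]      # patterns grouped by token, not by emotion
by_emotion = [A, A, B, B]    # patterns grouped by learned emotion
times_ms = [0, 50, 100, 150, 200]  # made-up latencies, not results from the study
data = [by_token, by_token, by_token, by_emotion, by_emotion]

# Model RDM for emotion: 0 = same emotion, 1 = different emotion,
# listed in the same pair order that neural_rdm() produces.
emotion_model = [0, 1, 1, 1, 1, 0]

fit = rsa_time_course(data, emotion_model)
for t, r in zip(times_ms, fit):
    print(f"{t:>4} ms  model fit r = {r:+.2f}")
```

In this toy setup the model fit is negative while the patterns are grouped by token and rises to about 1 once they are grouped by emotion. A real analysis would estimate such a time course per participant from EEG epochs and test it against chance.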
