Purpose: In noisy environments such as a cocktail party, emotional signals play a crucial role in helping listeners unmask target speech. However, it remains unclear how emotional features carried in a speaker's vocal timbre shape neural processing over time. This study aimed to characterize the temporal neural dynamics of learned emotion associated with a speaker's voice in complex listening conditions. Method: We employed an emotional learning paradigm in a speech-on-speech context, pairing two different target speakers with either angry or neutral facial expressions. Electroencephalogram data were recorded from healthy participants, and multivariate pattern analysis combined with representational similarity analysis was used to track the temporal unfolding of learned emotion linked to the target speaker's voice. Results: We observed early neural signatures of emotional processing between 150 and 180 ms after stimulus onset, occurring nearly simultaneously with the decoding of speaker identity. Importantly, brain–behavior analysis revealed that subjective emotional valence ratings could be decoded from neural signals as early as 94 ms. These findings suggest that vocal emotion can be processed rapidly and in a manner relatively independent of the processing of low-level acoustic cues. Conclusion: Our study provides evidence that acquired emotional associations with a speaker's voice can shape early-stage neural dynamics during speech processing under challenging listening conditions. Supplemental Material: https://doi.org/10.23641/asha.31842814
Lu et al. studied this question.