August 11, 2025 · Open Access

Effects of fundamental frequency and vocal tract resonance on speech recognition in noise by non-native listeners

Key Points

Non-native listeners recognized speech less accurately when the talker's fundamental frequency and vocal tract resonance were mismatched than when they were matched, and the size of this effect differed between the two noise types.
Recognition was notably worse in the high fundamental frequency, low vocal tract resonance condition, which accounted for most of the penalty from mismatched voice features.
On English Hearing-in-Noise Test (HINT) sentences, four-talker babble reduced recognition accuracy more than speech-shaped noise did.
The findings have implications for communicating with non-native listeners in noisy environments and point to a need for tailored speech modifications.

Abstract

The present study examined the influence of changes in speakers’ fundamental frequency (fo) and vocal tract resonance (VTR) on speech recognition in different types of noise by non-native listeners. The goal was to identify whether the fo-VTR relationship has a similar effect on non-native listeners as it does on native listeners. Twenty-six adults who were native Mandarin speakers learning English as a second language were presented with English Hearing-in-Noise Test (HINT) sentences in four voice conditions with the original male speaker's fo doubled and/or VTR scaled up by a factor of 1.2: (1) low fo, low VTR (LfoLVTR, the original recordings); (2) low fo, high VTR (LfoHVTR); (3) high fo, high VTR (HfoHVTR); and (4) high fo, low VTR (HfoLVTR). The stimuli were presented in speech-shaped noise (SSN) and four-talker babble (FTB) at signal-to-noise ratios of −3, 0, and +3 dB. The results showed that the non-native listeners performed more poorly with fo-VTR mismatched voices than with fo-VTR matched voices, and the negative influence of mismatched voice features was mainly manifested in the HfoLVTR condition. Compared to SSN, FTB had a greater adverse impact on the non-native listeners’ recognition accuracy. Further, the performance difference between matched and mismatched conditions showed distinct patterns across SSN and FTB.
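The design described in the abstract can be laid out as a small sketch. The scale factors, noise types, and signal-to-noise ratios below come from the abstract; everything else is an assumption made for illustration. In particular, the fully crossed design, the RMS-based definition of signal-to-noise ratio, and all function names are hypothetical and are not taken from the authors' procedure.

```python
import math
from itertools import product

# Scale factors relative to the original male talker, as stated in the abstract.
VOICE_CONDITIONS = {
    "LfoLVTR": {"fo_factor": 1.0, "vtr_factor": 1.0},  # original recordings
    "LfoHVTR": {"fo_factor": 1.0, "vtr_factor": 1.2},
    "HfoHVTR": {"fo_factor": 2.0, "vtr_factor": 1.2},
    "HfoLVTR": {"fo_factor": 2.0, "vtr_factor": 1.0},
}
NOISE_TYPES = ("SSN", "FTB")  # speech-shaped noise, four-talker babble
SNRS_DB = (-3, 0, 3)


def is_matched(condition):
    """A voice is 'matched' when fo and VTR are both low or both high."""
    c = VOICE_CONDITIONS[condition]
    return (c["fo_factor"] > 1.0) == (c["vtr_factor"] > 1.0)


def design_cells():
    """All voice x noise x SNR combinations (assumes a fully crossed design)."""
    return list(product(VOICE_CONDITIONS, NOISE_TYPES, SNRS_DB))


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain for the noise so that 20*log10(rms(speech)/rms(noise)) == snr_db."""
    return rms(speech) / (rms(noise) * 10 ** (snr_db / 20))


def mix_at_snr(speech, noise, snr_db):
    """Add scaled noise to speech, sample by sample, at the requested SNR."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]
```

Under these assumptions the grid has 4 × 2 × 3 = 24 cells, and `is_matched` separates the two matched voices (LfoLVTR, HfoHVTR) from the two mismatched ones (LfoHVTR, HfoLVTR). Scaling the noise rather than the speech keeps the target sentence at a constant level across SNRs, which is one common choice and not necessarily the one used in the study.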


Xiao et al. studied this question.

synapsesocial.com/papers/68a35efb0a429f797332878f https://doi.org/10.1051/aacus/2025025
