The present study examined the influence of changes in speakers' fundamental frequency (fo) and vocal tract resonance (VTR) on speech recognition in different types of noise by non-native listeners. The goal was to determine whether the fo-VTR relationship affects non-native listeners in the same way it affects native listeners. Twenty-six adults who were native Mandarin speakers learning English as a second language were presented with English Hearing-in-Noise Test (HINT) sentences in four voice conditions, created by doubling the original male speaker's fo and/or scaling up the VTR by a factor of 1.2: (1) low fo, low VTR (LfoLVTR, the original recordings); (2) low fo, high VTR (LfoHVTR); (3) high fo, high VTR (HfoHVTR); and (4) high fo, low VTR (HfoLVTR). The stimuli were presented in speech-shaped noise (SSN) and four-talker babble (FTB) at signal-to-noise ratios of −3, 0, and +3 dB. The results showed that the non-native listeners performed more poorly with fo-VTR mismatched voices than with fo-VTR matched voices, and that the negative influence of mismatched voice features was mainly manifested in the HfoLVTR condition. Compared to SSN, FTB had a greater adverse impact on the non-native listeners' recognition accuracy. Further, the performance difference between matched and mismatched conditions showed distinct patterns across SSN and FTB.
Xiao et al. studied this question.