This study examined how changes in two sex-related acoustic properties of a talker’s voice, fundamental frequency (F0) and vocal tract resonance (VTR), affect speech recognition in noise. The stimuli were HINT sentences in which the original male talker’s F0 and VTR were manipulated (doubling F0 and/or scaling up VTR by a factor of 1.2) to create four conditions: low F0 low VTR (LF0LVTR, the original recordings), low F0 high VTR (LF0HVTR), high F0 high VTR (HF0HVTR), and high F0 low VTR (HF0LVTR). Randomly selected sentences from each condition were presented to 193 adults for a gender rating task on a 7-point scale. Then, all sentences were mixed with speech-shaped noise at signal-to-noise ratios of −10, −5, 0, and +5 dB and presented to 42 normal-hearing adults for recognition. The two conditions with matched F0 and VTR (HF0HVTR and LF0LVTR) were perceived as male or female voices and showed no significant differences in recognition accuracy or estimated speech reception thresholds (SRTs). However, the mismatched conditions (HF0LVTR and LF0HVTR) showed reduced recognition performance and significantly higher SRTs than the matched conditions. In general, voices with matched F0 and VTR yield equivalent speech recognition in noise, whereas voices with mismatched F0 and VTR may reduce intelligibility in noise.
Yang et al. studied this question.
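The noise-mixing step described above, presenting sentences at fixed signal-to-noise ratios (SNRs), can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `mix_at_snr` and `snr_db` are hypothetical, SNR is assumed to be defined on mean signal power over the whole sentence, and white Gaussian noise stands in for the speech-shaped noise used in the study.

```python
import numpy as np


def snr_db(speech, noise):
    """Return the SNR in dB, computed from mean power of each signal."""
    speech_power = np.mean(np.asarray(speech, dtype=float) ** 2)
    noise_power = np.mean(np.asarray(noise, dtype=float) ** 2)
    return 10.0 * np.log10(speech_power / noise_power)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so that speech-to-noise power matches `target_snr_db`.

    Hypothetical helper: returns the mixture and the scaled noise.
    Assumes `noise` is at least as long as `speech` and is not silent.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]  # trim noise to the sentence length

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power required for the requested SNR: P_s / 10^(SNR/10)
    target_noise_power = speech_power / (10.0 ** (target_snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)

    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentence = rng.standard_normal(16000)  # placeholder for a recorded sentence
    masker = rng.standard_normal(16000)    # placeholder, NOT speech-shaped noise
    for level in (-10, -5, 0, 5):          # the four SNRs named in the abstract
        mixture, scaled = mix_at_snr(sentence, masker, level)
        print(level, round(snr_db(sentence, scaled), 3))
```

In practice the speech level is often held constant while the noise is scaled, as done here; scaling the speech instead would give the same SNR but a different overall presentation level.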