Vowel formants may be encoded peripherally either by local increases in firing rate (a rate-place code) or by decreases in neural fluctuations in fibers tuned near formants, relative to fibers tuned between formants. To distinguish between these alternatives, we compared the effects of intensity and amplitude-modulation cues on listeners’ ability to identify synthetic vowels. Harmonic tone complexes were generated with a spectral slope of 6 dB/octave, and formants were defined by either (1) increased intensity at formant frequencies or (2) changes in phase relationships between components that manipulated modulation depth at formant frequencies. In the test conditions, vowels were either (a) presented with intensity cues but with modulation cues in opposition (providing more, rather than less, modulation depth at the formant frequencies), or (b) defined solely by reductions in amplitude modulation at the formant frequencies. Two control conditions used uniform cosine phase or random phase, with standard intensity increases at the formants. Preliminary results suggest that listeners can perceive vowels without intensity increments, based only on local changes in modulation depth; that phase changes counteracting intensity cues worsen performance; and that performance with random-phase stimuli is poorer than with cosine-phase stimuli. These results are consistent with an important role for amplitude-modulation cues in the identification of synthetic vowels. Supported by NIH, Grant R01DC012262.
Maxwell et al. (Tue,) studied this question.
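As an illustration only, the two control stimuli described above (cosine-phase or random-phase harmonic complexes with intensity increments at the formants) might be sketched as follows. This is not the authors' stimulus code. The fundamental frequency, formant frequencies, increment size, sampling rate, and duration are all hypothetical values chosen for the example, and the 6 dB/octave slope is assumed here to be a falling slope. The phase manipulation used in the test conditions is not specified in the abstract and is not reproduced.

```python
import numpy as np


def harmonic_vowel(f0=100.0, formants=(500.0, 1500.0), fs=16000, dur=0.5,
                   slope_db_per_oct=-6.0, boost_db=10.0, phase="cosine", seed=0):
    """Sketch of a harmonic complex with intensity-defined formants.

    All default values are assumptions for illustration, not values
    taken from the abstract.
    """
    # Harmonics of f0 that lie strictly below the Nyquist frequency.
    n_harm = int((fs / 2 - 1) // f0)
    freqs = np.arange(1, n_harm + 1) * f0

    # Component levels follow a constant slope in dB per octave
    # (assumed falling), referenced to the fundamental at 0 dB.
    level_db = slope_db_per_oct * np.log2(freqs / f0)

    # Intensity cue: raise the harmonic nearest each formant frequency.
    for f in formants:
        level_db[np.argmin(np.abs(freqs - f))] += boost_db
    amps = 10.0 ** (level_db / 20.0)

    # Starting phases: all zero (cosine phase) or uniformly random.
    if phase == "cosine":
        phases = np.zeros(n_harm)
    elif phase == "random":
        phases = np.random.default_rng(seed).uniform(0.0, 2.0 * np.pi, n_harm)
    else:
        raise ValueError("phase must be 'cosine' or 'random'")

    # Sum the components and normalize the waveform to unit peak.
    t = np.arange(int(fs * dur)) / fs
    x = (amps[:, None]
         * np.cos(2.0 * np.pi * freqs[:, None] * t + phases[:, None])).sum(axis=0)
    return x / np.max(np.abs(x)), freqs, level_db


def crest_factor(x):
    """Peak-to-RMS ratio, a rough index of how peaky the envelope is."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))


x_cos, freqs, level_db = harmonic_vowel(phase="cosine")
x_rand, _, _ = harmonic_vowel(phase="random")
print(crest_factor(x_cos), crest_factor(x_rand))
```

Both stimuli share the same magnitude spectrum, so any difference between them lies in the temporal envelope. In cosine phase all components peak together once per period, which gives a larger peak-to-RMS ratio than the random-phase version.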