Lexical tone contrasts distinguish word meanings in Mandarin Chinese. While previous studies have identified several acoustic cues for lexical tone perception (e.g., pitch height, pitch contour), it remains unclear whether listeners weigh these cues differently when identifying lexical tones in background noise. To address this, the current study estimated the weights of individual spectro-temporal regions of the spectrogram for tone identification using the adaptive bubble noise technique (Mandel et al., 2019). We designed a bubble masker that randomly revealed parts of the stimulus through "bubbles" in the time-frequency domain while completely masking the rest. Each of two Mandarin vowels (/u/ and /i/), produced with four different tones, was then mixed with 200 unique bubble maskers. The spectro-temporal weights were derived by correlating listeners' accuracy with the audibility of the stimuli at each time-frequency point. Preliminary results from native Mandarin listeners with normal hearing indicate that not all spectro-temporal regions contribute equally to lexical tone identification. A dominant weight peak was observed in the low-frequency region (0.1–0.7 kHz), suggesting that listeners rely heavily on the fundamental frequency (F0) extracted from low-order harmonics. Temporally localized weight peaks appear to align with the pitch contour transitions, highlighting the importance of extending the weights from a frequency-only to a time-frequency representation for Mandarin.
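The weight derivation described above (correlating accuracy with audibility at each time-frequency point) can be illustrated with a minimal sketch. It assumes that audibility is stored as one array per trial and that the weight at each point is a point-biserial (Pearson) correlation with a 0/1 correctness score; the function name `spectro_temporal_weights`, the array shapes, and the simulated listener are hypothetical choices for illustration, not the procedure used in the study.

```python
import numpy as np


def spectro_temporal_weights(audibility, correct):
    """Correlate trial correctness with audibility at each time-frequency point.

    audibility: array of shape (n_trials, n_freq, n_time), e.g. 1 where a
                bubble reveals the stimulus and 0 where it is masked.
    correct:    array of shape (n_trials,), 1 for a correct response, else 0.
    Returns an (n_freq, n_time) map of Pearson correlations.
    """
    audibility = np.asarray(audibility, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Center both variables across trials.
    a = audibility - audibility.mean(axis=0)
    c = correct - correct.mean()

    # Sum over trials of c * a at every time-frequency point.
    cov = np.tensordot(c, a, axes=(0, 0))
    denom = np.sqrt((a ** 2).sum(axis=0) * (c ** 2).sum())

    # Points with no variance across trials get a weight of 0.
    return np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)


# Toy simulation (assumed values): a listener whose accuracy depends only on
# a small low-frequency region of the mask.
rng = np.random.default_rng(0)
masks = (rng.random((2000, 8, 10)) < 0.5).astype(float)
cue = masks[:, :2, 3:6].mean(axis=(1, 2))
responses = rng.random(2000) < 0.25 + 0.7 * cue
weights = spectro_temporal_weights(masks, responses)
print(weights[:2, 3:6].mean(), weights[2:].mean())
```

In this toy setup the mean weight inside the cue region should come out clearly larger than the mean weight elsewhere, which is the kind of localized peak the abstract reports for the low-frequency region.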
Y et al. (Wed,) studied this question.