What question did this study set out to answer?

This study aims to investigate how different frequency and temporal regions contribute to lexical tone perception in Mandarin under noisy conditions.

May 14, 2026

Measuring spectro-temporal weighting for Mandarin lexical tone perception with bubble noise

Key Points

The study evaluated tone identification with the adaptive bubble noise technique, which reveals only randomly placed time-frequency regions of the spectrogram.
Each of two Mandarin vowels (/u/ and /i/), produced with four tones, was mixed with 200 unique bubble maskers.
Spectro-temporal weights were calculated by correlating listeners' accuracy with stimulus audibility at each time-frequency point.
A dominant weight peak was found in the low-frequency region (0.1–0.7 kHz), suggesting reliance on the fundamental frequency (F0) extracted from low-order harmonics.
Preliminary results indicate that not all temporal-spectral regions contribute equally to tone identification.
Temporally localized weight peaks appeared to align with pitch contour transitions, which supports extending the weights from frequency-only to time-frequency space.

Abstract

Lexical tone contrasts are important for distinguishing word meanings in Mandarin Chinese. While previous studies have identified several acoustic cues for lexical tone perception (e.g., pitch height, pitch contour), it remains unclear whether listeners weigh these cues differently when identifying lexical tones in the presence of background noise. To address this, the current study evaluated the weight patterns of individual temporal-spectral regions in spectrograms for tone identification using the adaptive bubble noise technique (Mandel et al., 2019). We designed a bubble masker that randomly revealed parts of the stimulus through bubbles in the time-frequency domain while completely masking the rest. Each of two Mandarin vowels (/u/ and /i/) produced with four different tones was then mixed with 200 unique bubble maskers. The spectro-temporal weights were derived by correlating listeners' accuracy with the audibility of the stimuli at each temporal-spectral point. Preliminary results from native Mandarin listeners with normal hearing indicate that not all temporal-spectral regions contribute equally to lexical tone identification. A dominant weight peak was observed in the low-frequency region (0.1–0.7 kHz), suggesting that listeners rely heavily on the fundamental frequency (F0) extracted from low-order harmonics. The temporally localized weight peaks appear to align with the pitch contour transitions, highlighting the importance of extending the weights from frequency-only to time-frequency space for Mandarin.
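The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it omits the adaptive part of the procedure. It assumes that each bubble is a Gaussian window on a coarse frequency-by-time grid and that the "correlation" is a plain Pearson correlation between audibility and binary correctness (a point-biserial correlation). The function names, grid size, bubble widths, and the simulated listener are all hypothetical and are there only to make the example run.

```python
import math
import random


def bubble_mask(n_freq, n_time, n_bubbles, sigma_f, sigma_t, rng):
    """Return an audibility map in [0, 1] (1 = fully revealed).

    Assumption: each bubble is a 2-D Gaussian window on a coarse
    frequency-by-time grid; the real masker shape may differ.
    """
    centres = [(rng.uniform(0, n_freq - 1), rng.uniform(0, n_time - 1))
               for _ in range(n_bubbles)]
    mask = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            for cf, ct in centres:
                d2 = ((f - cf) / sigma_f) ** 2 + ((t - ct) / sigma_t) ** 2
                mask[f][t] = max(mask[f][t], math.exp(-0.5 * d2))
    return mask


def point_biserial(x, y):
    """Pearson correlation between audibility x and 0/1 correctness y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance, so the correlation is undefined; report 0
    return sxy / math.sqrt(sxx * syy)


def weight_map(masks, correct):
    """Correlate correctness with audibility at every grid point."""
    n_freq, n_time = len(masks[0]), len(masks[0][0])
    return [[point_biserial([m[f][t] for m in masks], correct)
             for t in range(n_time)]
            for f in range(n_freq)]


if __name__ == "__main__":
    rng = random.Random(0)
    masks = [bubble_mask(6, 8, 3, 1.0, 1.0, rng) for _ in range(200)]
    # Simulated listener (not real data): correct only when a hypothetical
    # cue at frequency bin 1, time bin 4 is mostly audible.
    correct = [1 if m[1][4] > 0.5 else 0 for m in masks]
    weights = weight_map(masks, correct)
    peak = max((w, f, t)
               for f, row in enumerate(weights)
               for t, w in enumerate(row))
    print("largest weight %.2f at frequency bin %d, time bin %d" % peak)
```

In this sketch a large positive weight marks a grid point where audibility tends to go with correct responses, which is the sense in which the abstract speaks of a weight peak. A real analysis would work on the actual spectrogram resolution and on listeners' recorded responses.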
