May 2, 2024 | Open Access

Towards Fair and Inclusive Speech Recognition for Stuttering: Community-led Chinese Stuttered Speech Dataset Creation and Benchmarking

Abstract

Despite the widespread adoption of Automatic Speech Recognition (ASR) models in voice-operated products and conversational AI agents, current ASR models perform poorly for people who stutter. One primary cause of the performance disparity is the lack of representative stuttered speech data during the development of ASR models. This work introduces the first stuttered speech dataset in Mandarin Chinese, created by a grassroots community of Chinese-speaking people who stutter to facilitate the development of inclusive and fair speech AI. Collected from 72 speakers with a wide range of stuttering characteristics, this dataset contains speech samples of both spontaneous conversations and voice command dictations from each speaker. Our analysis of the dataset shows the diversity and variability of stuttered utterances captured, highlighting its unique value in authentically representing the stuttering community in AI data. Leveraging this dataset, we benchmark popular ASR models to understand their potential biases against disfluent speech.

Cite This Study

Li et al. (2024).

synapsesocial.com/papers/68e6bd25b6db64358763ce18 https://doi.org/10.1145/3613905.3650950
