January 1, 2017 · Open Access

Gender and Dialect Bias in YouTube's Automatic Captions

Abstract

This project evaluates the accuracy of YouTube's automatically generated captions across two genders and five dialects of English. Speakers' dialect and gender were controlled for by using videos uploaded as part of the "accent tag challenge", in which speakers explicitly identify their language background. The results show robust differences in accuracy across both gender and dialect, with lower accuracy for 1) women and 2) speakers from Scotland. This finding builds on earlier research showing that speakers' sociolinguistic identity may negatively impact their ability to use automatic speech recognition, and demonstrates the need for sociolinguistically stratified validation of systems.
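
To make "sociolinguistically stratified validation" concrete, the sketch below scores captions per (dialect, gender) group rather than over a pooled test set. It assumes word error rate as the accuracy measure, which the abstract does not specify. The function names, the dictionary keys, and the toy transcripts are all hypothetical placeholders and are not data or results from the study.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def stratified_wer(samples):
    """Mean WER per (dialect, gender) group.

    `samples` is an iterable of dicts with the hypothetical keys
    'dialect', 'gender', 'reference', and 'hypothesis'.
    """
    groups = defaultdict(list)
    for s in samples:
        key = (s["dialect"], s["gender"])
        groups[key].append(word_error_rate(s["reference"], s["hypothesis"]))
    return {key: sum(v) / len(v) for key, v in groups.items()}


# Invented toy transcripts, for illustration only.
toy_samples = [
    {"dialect": "dialect_a", "gender": "f",
     "reference": "one two three four", "hypothesis": "one two three four"},
    {"dialect": "dialect_a", "gender": "f",
     "reference": "one two", "hypothesis": "one"},
    {"dialect": "dialect_b", "gender": "m",
     "reference": "one two three four", "hypothesis": "one too three four"},
]
per_group = stratified_wer(toy_samples)
```

Reporting one score per group, instead of a single pooled score, is what lets a gap between groups show up at all; a pooled average can hide a group for which the system performs much worse.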

Cite This Study

Rachael Tatman studied this question.

synapsesocial.com/papers/69dd4f3821232b10ec40c4cb https://doi.org/10.18653/v1/w17-1606