March 18, 2024 · Open Access

Zero Resource Code-Switched Speech Benchmark Using Speech Utterance Pairs for Multiple Spoken Languages

Abstract

We introduce a new zero-resource code-switched speech benchmark designed to directly assess the code-switching capabilities of self-supervised speech encoders. We present a baseline system that performs language modeling on discrete units to demonstrate how the code-switching abilities of speech encoders can be assessed in a zero-resource manner. Our experiments cover a variety of well-known speech encoders, including Wav2vec 2.0, HuBERT, and XLSR, on three tracks of different code-switched language pairs: Spanish-English, French-English, and Chinese-English. We examine the impact of pre-training languages and model size on benchmark performance. Notably, although our results show that speech encoders with multilingual pre-training, exemplified by XLSR, outperform monolingual variants (Wav2vec 2.0, HuBERT) in code-switching scenarios, there is still substantial room for improvement in their code-switching linguistic abilities.
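
The title indicates that evaluation is built on pairs of speech utterances, and the abstract describes a baseline language model over discrete units. One plausible way to score such a setup (an assumption for illustration, not a detail stated in the abstract) is to have the unit language model assign a score to each utterance in a pair and count how often the preferred utterance receives the higher score. The function name `pairwise_accuracy` and the scores below are hypothetical.

```python
from typing import Sequence, Tuple


def pairwise_accuracy(score_pairs: Sequence[Tuple[float, float]]) -> float:
    """Return the fraction of pairs whose first score exceeds the second.

    Each pair holds (score of the preferred utterance, score of the other
    utterance). The scores are assumed to be log-probabilities produced by
    a language model over discrete speech units; ties count as misses.
    """
    if not score_pairs:
        raise ValueError("need at least one utterance pair")
    wins = sum(1 for preferred, other in score_pairs if preferred > other)
    return wins / len(score_pairs)


# Hypothetical log-probabilities, invented purely for illustration.
pairs = [(-41.2, -44.9), (-37.5, -36.8), (-52.0, -55.3), (-29.4, -33.1)]
accuracy = pairwise_accuracy(pairs)
print(accuracy)
```

Under this reading, an encoder whose discrete units carry no usable code-switching information would be expected to score near 0.5, since the language model would effectively be guessing between the two utterances.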

Authors

Kuan-Po Huang

Chih-Kai Yang

Yu-Kuan Fu

Institutions

University of Toronto

National Taiwan University
