August 16, 2025 | Open Access

Fine-Tuning DeepSpeech Speech-To-Text Model for Nigerian English and Yoruba-English Code-Switched Speech

Key Points

After fine-tuning, the best model reached a Word Error Rate (WER) of 0.76 and a Character Error Rate (CER) of 0.38 at 55 epochs.
These figures are a modest improvement over the baseline DeepSpeech 0.9.3 model on Nigerian English and Yoruba-English code-switched speech.
Transfer learning was employed through iterative training on a custom dataset of approximately 118 minutes.
Constraints like limited computing power and dataset size highlight the challenges in low-resource language processing.

Abstract

Speech-to-Text (STT) systems, despite their stellar performance in recent years, still struggle with recognising non-Western English accents and speech that features Code-Switching (CS), a linguistic phenomenon common in regions such as Nigeria. This study addresses that challenge for Nigerian English and Yoruba-English code-switched speech by adapting Mozilla’s DeepSpeech 0.9.3 model and fine-tuning it using a custom dataset of 118 minutes (approximately 1.97 hours). This process involved transfer learning and hyperparameter optimisation over iterative training sessions on a CPU-based setup. The model’s performance was evaluated using Word Error Rate (WER) and Character Error Rate (CER), with the best model showing modest improvements over the baseline model and achieving a WER of 0.760261 and CER of 0.381241 after 55 epochs. Although limited computing resources and the small dataset imposed significant constraints on the work, the study demonstrated the potential of fine-tuning and transfer learning for model adaptation to low-resource languages and code-switching contexts. Future work will require access to GPU resources for improved convergence and transcription accuracy, an expanded dataset and support for Yoruba diacritics to improve the quality of transcriptions.
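The abstract reports results as WER and CER. As a rough illustration of how these two metrics are commonly defined, the sketch below computes both from a Levenshtein edit distance, at word level for WER and at character level for CER. It is not the authors' evaluation code: the function names and the sample sentences are hypothetical, and text normalisation (casing, punctuation, diacritics) is assumed to have been done beforehand.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcript pair, used only to exercise the functions.
    reference = "the cat sat down"
    hypothesis = "the hat sat"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under these definitions, a WER of 0.76 means that the number of word-level substitutions, insertions and deletions equals roughly 76% of the words in the reference transcripts. Because insertions are counted, both metrics can exceed 1.0.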

Cite This Study

Olorunshola et al. (2025) studied this question.

synapsesocial.com/papers/68a366b20a429f797332ce5b
https://doi.org/10.9734/ajrcos/2025/v18i8737
