What type of study is this?

This is a qualitative study.

What question did this study set out to answer?

The study aims to improve translation performance for Kimbundu, a low-resource language, using multilingual NMT models.

March 8, 2026 · Open Access

Improving Kimbundu–Portuguese Neural Machine Translation through Fine-Tuning of Multilingual Models

Key points

Built a manually curated, human-reviewed Kimbundu–Portuguese parallel corpus
Fine-tuned the NLLB200 (600M) multilingual model
Used parameter-efficient fine-tuning with QLoRA for the Kimbundu→Portuguese direction
Evaluated on a professionally reviewed test set of 1,000 sentence pairs
Gained +10.1 BLEU over strong multilingual baselines
Gained +13.2 chrF over the same baselines
Semantic metrics (COMET, AfriCOMET, and BERTScore) improved consistently
Qualitative analysis confirmed better handling of Kimbundu's complex morphology
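
QLoRA keeps the base model's weights frozen in 4-bit form and trains only small low-rank adapter matrices (LoRA) added to selected layers. The sketch below illustrates why this is parameter-efficient by counting the trainable parameters for a single weight matrix. The layer size and rank are hypothetical values chosen for illustration; the paper summary does not state the rank or target layers used.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical sizes, NOT taken from the paper.
d_out, d_in, rank = 1024, 1024, 16

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA adapter:     {lora} trainable parameters")
print(f"fraction trained: {lora / full:.4f}")
```

With these assumed sizes the adapter trains about 3% of the parameters of the matrix it adapts; the real ratio depends on the rank and on which layers receive adapters.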

Abstract

Neural Machine Translation (NMT) systems have achieved strong performance for high-resource languages; however, many African languages remain underrepresented due to the scarcity of high-quality parallel data. Kimbundu, a Bantu language spoken in Angola, is one such low-resource language with limited machine translation support. In this work, we introduce a manually curated and human-reviewed Kimbundu–Portuguese parallel corpus and investigate its use for fine-tuning multilingual NMT models. By leveraging the NLLB200 (600M) architecture, we employ parameter-efficient fine-tuning with QLoRA to adapt the model to the Kimbundu→Portuguese direction. Experimental results on a professionally reviewed test set of 1,000 sentence pairs demonstrate substantial improvements over strong multilingual baselines, with gains of +10.1 BLEU and +13.2 chrF. Furthermore, semantic metrics (COMET, AfriCOMET, and BERTScore) show consistent growth, while qualitative analysis confirms better handling of Kimbundu's complex morphology. These findings suggest that high-quality human-reviewed data, combined with efficient fine-tuning, is a viable path to bridging the digital divide for low-resource African languages.
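
To make the chrF figure above more concrete, here is a minimal sketch of a sentence-level character n-gram F-score. It is a simplified illustration, not the scoring tool the authors used: it assumes whitespace is stripped, n-gram orders 1 to 6, and beta = 2 (recall weighted more than precision), and it averages precision and recall over the orders that exist in both strings. The function names and the example sentences are invented for this sketch.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (an assumption of this sketch)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return 100 * (1 + b2) * p * r / (b2 * p + r)


# Invented Portuguese example, not drawn from the paper's test set.
reference = "o menino vai à escola"
hypothesis = "o menino foi à escola"
print(f"chrF = {chrf(hypothesis, reference):.1f}")
```

Because the score is built from character n-grams rather than whole words, a hypothesis that gets a word stem right but an affix wrong still earns partial credit, which is one reason chrF is often reported alongside BLEU for morphologically rich languages.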
