February 22, 2024 · Open Access

Zero-shot cross-lingual transfer in instruction tuning of large language models

Abstract

Instruction tuning (IT) is widely used to teach pretrained large language models (LLMs) to follow arbitrary instructions, but is under-studied in multilingual settings. In this work, we conduct a systematic study of zero-shot cross-lingual transfer in IT, where an LLM is instruction-tuned on English-only data and then tested on user prompts in other languages. We investigate the influence of model configuration choices and devise a multi-facet evaluation strategy for multilingual instruction following. We find that cross-lingual transfer does happen successfully in IT even if all stages of model training are English-centric, but only if multilinguality is taken into account in hyperparameter tuning and the IT data is large enough. English-trained LLMs are capable of generating correct-language, comprehensive and helpful responses in other languages, but suffer from low factuality and may occasionally make fluency errors.

Cite This Study

Chirkova et al. (2024) studied this question.

synapsesocial.com/papers/68e781fab6db6435876f55e1 https://doi.org/10.48550/arxiv.2402.14778
