November 25, 2025 | Open Access

Tuning of language models in Eastern European languages on Twitter/X

Key Points

Adapter fine-tuning on as few as ≈ 600 tweets raised the sentiment analysis scores of the universal models (Llama, Mistral) to the level previously reported for Twitter/X-specialised models on popular datasets.
The transfer-learning models (BERT, BERTweet, and XLM-T) performed worse than the universal models.
The study examines several experimental settings that influence fine-tuning success in underrepresented languages (Czech, Slovak, Polish, and Hungarian).
Translating from underrepresented languages into English improved the results of all models tested, despite previous successful experiments with multilingual models.

Abstract

We address the problem of fine-tuning large language models (LLMs) for sentiment analysis on Twitter/X in underrepresented Eastern European languages (Czech, Slovak, Polish, and Hungarian). We study the influence of a number of experimental settings on the efficiency of fine-tuning in two groups of LLMs: transfer-learning models (BERT, BERTweet, and XLM-T, the latter two pre-trained on a Twitter corpus) and popular mid-sized universal models (Llama, Mistral). We show that adapter fine-tuning with as few as ≈ 600 tweets improved the scores of our universal models to the level previously reported for Twitter/X-specialised models on popular datasets, while our transfer-learning models performed worse. We also show that, despite previous successful experiments with multilingual models, translating from underrepresented languages into English still improves the results of all models tested. Several other factors that influence the success of fine-tuning are also included in the study.
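
The abstract does not say which adapter method was used. As an illustration only, the sketch below assumes a low-rank (LoRA-style) adapter, one common form of adapter fine-tuning: the pre-trained weight matrix stays frozen and only two small matrices are trained. All names (`adapter_forward`, `down`, `up`) and the layer sizes are hypothetical and chosen for the example, not taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def adapter_forward(x, w_frozen, down, up, scale=1.0):
    """Return x @ W + scale * (x @ down @ up); W is never modified."""
    base = matmul(x, w_frozen)
    update = matmul(matmul(x, down), up)
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]


def count_params(m):
    """Number of scalar entries in a matrix."""
    return sum(len(row) for row in m)


# Assumed toy sizes; real model layers are far larger.
d_in, d_out, rank = 64, 64, 4
rng = random.Random(0)

w_frozen = [[rng.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
down = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_in)]
up = [[0.0] * d_out for _ in range(rank)]  # zero init: adapter starts as a no-op

trainable = count_params(down) + count_params(up)
frozen = count_params(w_frozen)
print(f"trainable={trainable}, frozen={frozen}, ratio={trainable / frozen:.3f}")
```

Under these assumptions the trainable parameters are a small fraction of the frozen ones, which is one plausible reason a few hundred labelled tweets can be enough to adapt a mid-sized model.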
