What question did this study set out to answer?

This work aims to systematically review and compare large language models tailored for the Arabic language and its dialects.

April 11, 2026

A Survey of Large Language Models for Arabic Language and its Dialects

Key Points

Systematically reviewed and compared large language models tailored for the Arabic language and its dialects.
Categorized models by architecture: encoder-only, decoder-only, and encoder-decoder.
Analyzed monolingual, bilingual, and multilingual models.
Evaluated model performance on sentiment analysis, named entity recognition, and question answering.
Assessed model openness regarding access to source code and training data.
Found a concentration of resources on Modern Standard Arabic.
Highlighted a lack of diverse dialectal datasets.
Identified limited transparency in many models.
Outlined challenges and research opportunities for inclusive Arabic NLP.

Abstract

This survey presents a comprehensive review of Large Language Models (LLMs) developed for the Arabic language and its dialects. It categorizes models by architecture (encoder-only, decoder-only, and encoder-decoder) and by linguistic form, including Classical Arabic, Modern Standard Arabic (MSA), and Dialectal Arabic. We analyze monolingual, bilingual, and multilingual models, evaluating their performance on tasks such as sentiment analysis, named entity recognition, and question answering. The survey also assesses model openness, considering factors like access to source code, training data, weights, and documentation. Our findings highlight a concentration of resources on MSA, a lack of diverse dialectal datasets, and limited transparency across many models. This work offers the first systematic comparison of openness and linguistic coverage in Arabic LLMs and outlines key challenges and research opportunities to support more inclusive, reproducible, and representative Arabic NLP.
