What question did this study set out to answer?

The research aims to develop effective detection mechanisms for identifying AI-generated essays in Arabic educational contexts.

March 23, 2026 · Open Access

Student or AI? Automated Detection of AI-generated Student Essays

Key Points

Fine-tuned large language models (LLMs) to detect AI-generated essays
Used CAMeLBERT-based models for binary classification
Created three unique datasets to represent various AI-generation scenarios
Achieved an average detection accuracy of 95.5% across all datasets
Validated the effectiveness of fine-tuned Arabic language models
Provided a practical tool for educational institutions to combat academic dishonesty

Abstract

The rapid proliferation of widely accessible large language models (LLMs) such as ChatGPT and Gemini has transformed research and educational practice while introducing serious challenges to academic integrity. These models are increasingly misused to generate student essays that are passed off as original work, which calls for robust detection mechanisms. Despite growing concern, the existing literature lacks timely solutions for identifying AI-generated academic content, particularly in non-English contexts such as Arabic, where linguistic complexity and limited resources compound the challenge. This study addresses that gap by fine-tuning LLMs to detect AI-generated student essays in Arabic educational settings. We introduce three novel datasets designed to capture diverse AI-generation scenarios in academic writing. Our methodology employs CAMeLBERT-based models, fine-tuned for binary classification between human-authored and AI-generated essays. Experimental results show high performance across all three datasets, with an average accuracy of 95.5%, which supports both the effectiveness of the approach and its adaptability to different detection scenarios. The contributions of this work are: (1) we present a comprehensive framework for detecting AI-generated Arabic student essays, (2) we create and publicly release three benchmark datasets to facilitate future research in this domain, and (3) we demonstrate that fine-tuned Arabic language models can achieve high detection accuracy, providing educational institutions with a practical tool for safeguarding academic integrity in the era of generative AI.
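
The abstract reports a single average accuracy over three datasets but does not state how that average is formed. As a minimal sketch, assuming an unweighted mean of per-dataset accuracies, the calculation could look like the following. The function names, dataset names, and labels are hypothetical toy values for illustration only, not the study's data or code.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of essays whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def average_accuracy(datasets):
    """Return per-dataset accuracies and their unweighted mean (an assumption)."""
    per_dataset = {name: accuracy(t, p) for name, (t, p) in datasets.items()}
    return per_dataset, mean(per_dataset.values())


# Hypothetical toy labels: 0 = human-authored, 1 = AI-generated.
toy = {
    "dataset_a": ([0, 1, 1, 0], [0, 1, 1, 0]),
    "dataset_b": ([1, 0, 1, 0], [1, 0, 0, 0]),
    "dataset_c": ([0, 0, 1, 1], [0, 1, 1, 0]),
}

per_dataset, avg = average_accuracy(toy)
print(per_dataset, avg)
```

If the datasets differ in size, a mean weighted by the number of essays would generally give a different figure, so the averaging method matters when comparing reported results.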
