What question did this study set out to answer?

This research aims to detect Arabic poetry generated by AI platforms by analyzing reusability and prosodic errors.

June 14, 2026

Arabic AI Poem Detection Based on Reusability and Prosody

Key Points

The method extracts an 11-dimensional feature vector that captures character-level n-gram reuse against AI and human reference corpora, together with prosodic features.
The feature vector is fed into several machine learning classifiers.
Detection accuracy is evaluated both on poems from the training platform and on poems from platforms unseen during training.
Accuracy reaches up to 99.5% on poems from ChatGPT-4o, the platform used for training.
Accuracy reaches 92.9% on poems from five unseen platforms: ChatGPT-V5, Microsoft Copilot, Google Gemini, DeepSeek, and xAI Grok.
This cross-platform generalization indicates a behavioral similarity in how different AI systems generate content.

Abstract

With the rapid advancement of Large Language Model (LLM) platforms, these systems have become increasingly capable of producing literature and poetry. However, the artificial imprint remains evident in synthetic poetry, as it must adhere to specific structural rules. To comply with these rules, AI platforms rely on reusing lexical and sub-word expressions, producing verse of an acceptable but statistically distinct character. This study addresses the specific case of Arabic poetry, where these platforms have not yet achieved the capability to produce poetry entirely free of prosodic errors. This research leverages both of these tendencies—elevated reuse rates and prosodic inconsistencies—to detect Arabic poems generated by various AI platforms. Detection is performed by extracting an 11-dimensional feature vector that captures the degree of character-level n-gram reuse against AI and human reference corpora, alongside features derived from the prosodic transformation of the poem. The study demonstrated that feeding this feature vector into different machine learning classifiers can yield an accuracy of up to 99.5% when detecting poems from ChatGPT-4o, the platform used for training. The accuracy reached 92.9% when detecting poems from five unseen platforms—ChatGPT-V5, Microsoft Copilot, Google Gemini, DeepSeek, and xAI Grok—whose generated content was not used in training. This cross-platform generalization indicates a behavioral similarity in content generation methodologies across different AI systems.
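To make the idea of character-level n-gram reuse concrete, the following is a minimal sketch, not the authors' implementation. It measures what fraction of a poem's character n-grams also occur in a reference corpus, once against an AI corpus and once against a human corpus. The function names (`char_ngrams`, `reuse_rate`, `reuse_features`), the n-gram sizes, and the whitespace normalization are all assumptions made for illustration. The abstract does not specify how the 11 features are composed, and the prosodic features are not modeled here.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping character n-grams of a whitespace-normalized text."""
    normalized = " ".join(text.split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def reuse_rate(poem: str, reference_corpus: list[str], n: int = 3) -> float:
    """Fraction of the poem's n-gram occurrences that also appear in the corpus.

    Hypothetical reuse measure; the study's exact definition is not given.
    """
    reference = set()
    for document in reference_corpus:
        reference.update(char_ngrams(document, n))

    grams = char_ngrams(poem, n)
    if not grams:
        return 0.0  # poem shorter than n characters
    return sum(1 for gram in grams if gram in reference) / len(grams)


def reuse_features(poem: str, ai_corpus: list[str], human_corpus: list[str],
                   sizes: tuple[int, ...] = (2, 3, 4)) -> list[float]:
    """Build an illustrative feature list: (AI reuse, human reuse) per n-gram size."""
    features = []
    for n in sizes:
        features.append(reuse_rate(poem, ai_corpus, n))
        features.append(reuse_rate(poem, human_corpus, n))
    return features
```

Under these assumptions, a poem that reuses more n-grams from the AI corpus than from the human corpus would produce higher values in the AI positions of the list, and a classifier could learn a decision boundary from such vectors. Python strings are Unicode, so the same code applies to Arabic text without modification, although diacritic handling would need a deliberate choice in practice.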
