GPTZero is an AI detection platform that scans written text for statistical signatures of machine generation and returns a probability score estimating whether it was produced by a human or an AI. In higher education, many teachers have turned to AI detection as a first-line response to the integrity crisis triggered by large language models. However, empirical findings on GPTZero’s efficacy are notably mixed. Some studies report strong diagnostic value under controlled conditions, while others document substantial false-negative rates, near-random performance on certain AI-generated essays, and frequent misclassification of AI-translated texts across several languages. Multilingual and L2 writers often bear the greatest cost, as their carefully constructed English is sometimes assigned high AI-likelihood scores because their linguistic profiles may appear less natural to models trained predominantly on standard or formulaic patterns of written English. In developing countries, where students commonly write in English as a second or third language, these limitations represent more than minor technical issues; they raise concerns about equity, potentially placing disproportionate burdens on writers working to meet academic language expectations. This article argues that GPTZero is unsuitable as a definitive tool for high-stakes assessment of writing. Instead, it proposes a shift toward postplagiarism frameworks that recognize responsible AI use. Within this approach, AI detection outputs serve as formative resources for developing critical AI literacy rather than surveillance tools. Flagged content becomes a starting point for metacognitive dialogue, which supports trust-based pedagogies that emphasize student agency and intellectual accountability.
Giray et al. studied this question.