What question did this study set out to answer?

This study analyzes how effectively LLMs can mimic human writing styles and how reliably LLM-generated text can be distinguished from text written by human authors.

May 4, 2026 | Open Access

Breaking the Imitation Game: Can LLMs Fool Humans and Machines Alike?

Key Points

Analyzed stylometric features of texts generated by 10 different LLMs and human authors.
Conducted a survey to test human recognition of text authorship.
Employed machine learning models trained on adversarially created LLM samples for improved accuracy.
Participants correctly identified LLM-generated text with only 15% accuracy.
Stylometric classifiers achieved up to 99% accuracy when trained with adversarially generated samples.

Abstract

The emergence of large language models (LLMs) has significantly advanced natural language processing (NLP); however, their capacity to generate human‐like content introduces serious security concerns. In particular, the misuse of LLMs for disinformation and impersonation on social media platforms such as X creates new opportunities for large‐scale manipulation and deception of users. This study conducts a comprehensive investigation to (i) understand which distinct stylistic features are effectively mimicked by 10 different LLMs and (ii) distinguish between LLM‐driven and human authors when LLMs are explicitly instructed to mimic a specific human writing style. In particular, we design adversarial prompts to mimic the writing style of human authors based on key stylometric features (quantitative measures of writing style) and assess the mimicking effectiveness of different LLMs through extensive statistical testing. In addition, we conduct a survey that gauges human ability to recognize the author of a text, and we train machine learning models to identify human‐ and LLM‐driven authors, focusing on scenarios where specifically crafted adversarial prompts are employed to facilitate style impersonation. Our findings demonstrate that, when explicitly instructed, LLMs can effectively replicate features of human writing style. The survey results indicate that participants find it challenging to distinguish between the different types of authors: they achieved a classification accuracy of only 15% in correctly identifying LLM‐generated text. In contrast, high detection performance is achieved only when the training data incorporate adversarially generated LLM samples produced using impersonation‐oriented prompts. Under this threat‐model–aligned training regime, stylometric classifiers exhibit strong discriminative capability, attaining classification accuracy of up to 99% in distinguishing human‐authored text from LLM‐generated text.
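
To make "stylometric features" concrete, the following is a minimal Python sketch of how a few such features could be computed from a single text. The feature set and the function name `stylometric_features` are assumptions chosen for illustration; the abstract does not list the features or tooling the authors used, so this should not be read as their method.

```python
import re
import string
from statistics import mean


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few simple, illustrative stylometric features for one text.

    The feature choice is hypothetical and is not taken from the paper.
    """
    # Lowercased alphabetic tokens (apostrophes kept so "don't" stays one word).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)  # guard against division by zero
    n_words = max(len(words), 1)
    return {
        "avg_word_length": mean(len(w) for w in words) if words else 0.0,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / n_words,
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars,
        "uppercase_ratio": sum(c.isupper() for c in text) / n_chars,
    }


sample = "Wow, this is great! I really, really like it."
features = stylometric_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Vectors like these, computed for both human-authored and LLM-generated samples, could then be passed to any standard classifier. Per the abstract, what matters for detection performance is that the LLM samples in the training data are produced with impersonation-oriented prompts.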
