What question did this study set out to answer?

This research aims to evaluate how diverse instruction patterns affect the performance of LLM-generated text detectors.

March 17, 2026 | Open Access

On the Robustness of LLM-Generated Text Detection Against Instruction Diversity

Key Points

Systematically examined detectors' robustness to instruction diversity.
Manually created task-oriented constraints based on essay quality factors.
Conducted experiments focusing on student essay writing to assess detection performance.
Detectors showed large variance in performance under diverse instruction constraints.
Standard deviation of detection F1-score reached up to 14.4 points under task-oriented constraints.
Overall, the constraints made LLM-generated text harder to detect than text generated without them.

Abstract

To combat the misuse of large language models (LLMs), many recent studies have presented LLM-generated text detectors with promising performance. When users instruct LLMs to generate text, the instruction can include different constraints depending on the user's needs. Prior studies on prompt sensitivity have shown that even small differences in instructions can substantially alter the quality and characteristics of generated texts. However, most recent studies have not covered such diverse instruction patterns when creating datasets for LLM detection. In this study, we systematically examined the robustness of detectors to instruction diversity through task-oriented constraints that naturally appear in instructions but are not related to detection evasion. We demonstrated that even powerful detectors exhibit a large variance in detection performance under such constraints. Focusing on student essay writing as a realistic domain, task-oriented constraints were manually created based on several essay quality factors. Our experiments showed that the standard deviation (SD) of the current detectors' performance on texts generated by an instruction with such a constraint is significantly larger (up to an SD of 14.4 F1-score) than that of generating texts multiple times or paraphrasing the instruction. We also observed an overall trend in which the constraints made LLM detection more challenging than without them. Our analysis suggests that this variance cannot be attributed to trivial output variation across constraints or fluctuations due to the average performance level, but instead stems from vulnerabilities in detectors specific to these constraints. In particular, detectors exhibit large performance degradation under constraints on the vocabulary or style of the generated texts. Finally, to better understand this effect, we found that the high instruction-following ability of LLMs fosters a large impact of such constraints on detection performance. 
To facilitate further development of robust detectors against diverse instructions, we released our datasets at https://github.com/ryuryukke/HowYouPromptMatters.
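
The variance measure described in the abstract can be illustrated with a minimal sketch: compute a detector's F1-score separately on the texts generated under each constraint, then take the standard deviation of those scores. Everything below is an assumption made for illustration. The constraint names, labels, and predictions are toy values and not data from the study, the helper `f1_spread` is hypothetical, and the use of the population standard deviation is a choice of this sketch, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class (here: 1 = LLM-generated, 0 = human-written)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_spread(predictions_by_constraint):
    """Return per-constraint F1 (0-100 scale), their mean, and their SD.

    Hypothetical helper: the input maps a constraint name to a pair
    (y_true, y_pred). Population SD is an assumption of this sketch.
    """
    scores = {
        name: 100 * f1_score(y_true, y_pred)
        for name, (y_true, y_pred) in predictions_by_constraint.items()
    }
    values = list(scores.values())
    return scores, mean(values), pstdev(values)


# Toy labels and detector outputs, invented purely for illustration.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_predictions = {
    "no_constraint": (labels, [1, 1, 1, 1, 0, 0, 0, 0]),
    "formal_style": (labels, [1, 1, 1, 0, 0, 0, 0, 0]),
    "simple_vocabulary": (labels, [1, 1, 0, 0, 1, 0, 0, 0]),
}

scores, average, spread = f1_spread(toy_predictions)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.1f}")
print(f"mean F1 = {average:.1f}, SD across constraints = {spread:.1f}")
```

A large spread relative to the one obtained by regenerating texts with the same instruction, or by paraphrasing it, is the kind of signal the abstract reports; the same helper could be applied to those baseline conditions for comparison.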
