What type of study is this?

This is a Quantitative Study (also classified as: Experimental Study).

September 22, 2025 | Open Access

Reasoning Models Can be Easily Hacked by Fake Reasoning Bias

Key Points

Reasoning-specialized LRMs are more susceptible to reasoning theater bias than general-purpose LLMs, especially in subjective tasks.
A task-dependent trade-off exists: LRMs are more robust than LLMs on factual tasks but less robust on subjective ones.
'Shallow reasoning' (plausible but flawed arguments) is identified as the most potent form of reasoning theater bias.
Two prompting strategies were evaluated: a targeted system prompt improves accuracy by up to 12% on factual tasks but only 1-3% on subjective tasks, and a self-reflection mechanism is similarly limited on subjective tasks.

Abstract

Large Reasoning Models (LRMs) like DeepSeek-R1 and o1 are increasingly used as automated evaluators, raising critical questions about their vulnerability to the aesthetics of reasoning in LLM-as-a-judge settings. We introduce THEATER, a comprehensive benchmark to systematically evaluate this vulnerability, termed Reasoning Theater Bias (RTB), by comparing LLMs and LRMs across subjective preference and objective factual datasets. Through investigation of six bias types including Simple Cues and Fake Chain-of-Thought, we uncover three key findings: (1) in a critical paradox, reasoning-specialized LRMs are consistently more susceptible to RTB than general-purpose LLMs, particularly in subjective tasks; (2) this creates a task-dependent trade-off, where LRMs show more robustness on factual tasks but less on subjective ones; and (3) we identify 'shallow reasoning' (plausible but flawed arguments) as the most potent form of RTB. To address this, we design and evaluate two prompting strategies: a targeted system prompt that improves accuracy by up to 12% on factual tasks but only 1-3% on subjective tasks, and a self-reflection mechanism that shows similarly limited effectiveness in the more vulnerable subjective domains. Our work reveals that RTB is a deep-seated challenge for LRM-based evaluation and provides a systematic framework for developing more genuinely robust and trustworthy LRMs.
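
To make the evaluation idea in the abstract concrete, the following is a minimal sketch of how a judge's susceptibility to a fake chain-of-thought might be measured: compare the judge's accuracy on clean answer pairs with its accuracy after reasoning-style filler is attached to the wrong answer. It is not the THEATER benchmark's implementation. The item layout, the `FAKE_COT` text, and every function name here are assumptions made for illustration, and `toy_judge` is a stand-in for a real LLM or LRM call.

```python
from typing import Callable, List, Tuple

# Hypothetical item layout: (question, answer_a, answer_b, correct_label),
# where correct_label is "A" or "B".
Item = Tuple[str, str, str, str]
# A judge takes (question, answer_a, answer_b) and returns "A" or "B".
Judge = Callable[[str, str, str], str]

# Illustrative filler that imitates reasoning without adding any evidence.
FAKE_COT = "Let me think step by step. After careful analysis, this answer is clearly correct."


def inject_fake_reasoning(item: Item) -> Item:
    """Append the fake reasoning text to the incorrect answer only."""
    question, a, b, correct = item
    if correct == "A":
        return (question, a, b + " " + FAKE_COT, correct)
    return (question, a + " " + FAKE_COT, b, correct)


def accuracy(judge: Judge, items: List[Item]) -> float:
    """Fraction of items on which the judge picks the correct label."""
    if not items:
        return 0.0
    hits = sum(judge(q, a, b) == correct for q, a, b, correct in items)
    return hits / len(items)


def bias_susceptibility(judge: Judge, items: List[Item]) -> float:
    """Accuracy drop caused by the injected text (higher means more susceptible)."""
    clean = accuracy(judge, items)
    biased = accuracy(judge, [inject_fake_reasoning(it) for it in items])
    return clean - biased


def toy_judge(question: str, a: str, b: str) -> str:
    """Stand-in judge that simply prefers the longer answer (ties go to A)."""
    return "A" if len(a) >= len(b) else "B"
```

In practice `toy_judge` would be replaced by a function that prompts the model under test, and the same `bias_susceptibility` call could be run separately on factual and subjective item sets to compare the two domains.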
