What question did this study set out to answer?

This research aims to uncover deficiencies in LLM agent behavior that stay hidden when only the agent's outputs are evaluated.

March 12, 2026 · Open Access

The Competence Illusion: Detecting Structural Failure Modes in LLM Agent Trajectories

Key Points

Introduced the term competence illusion for agent trajectories that produce superficially correct or confident outputs while exhibiting structurally deficient behavior.
Developed CAUM (Continuous Agent Understanding Monitor), a system that monitors agent trajectories through structural integrity scoring, substance quality assessment, and temporal stability analysis.
Created RS3, a behavioral stress-test suite of 20 hand-crafted archetypes spanning five categories, to evaluate CAUM's detection capabilities.
Achieved an 88.1% illusion detection rate (95% CI: 83-93%), 38 percentage points above naive baselines.
Maintained a 90% clean rate across three legitimate control archetypes.
Showed that structural trajectory analysis captures failure modes invisible to output-only evaluation.

Abstract

Large Language Model (LLM) agents increasingly operate autonomously in production environments, yet current evaluation methods rely primarily on output-focused benchmarks that fail to capture how an agent arrives at its answer. We introduce the concept of competence illusion: agent trajectories that produce superficially correct or confident outputs while exhibiting structurally deficient behavior patterns, including sycophancy, specification gaming, temporal degradation, and false confidence projection. We present CAUM (Continuous Agent Understanding Monitor), a privacy-preserving structural observation system that analyzes agent trajectories through three complementary diagnostic layers: structural integrity scoring (UDS), substance quality assessment (SQI), and temporal stability analysis (TDS). To evaluate CAUM's detection capabilities, we introduce RS3, a behavioral stress-test suite of 20 hand-crafted archetypes spanning five categories. CAUM achieves an 88.1% illusion detection rate (95% CI: 83-93%), outperforming naive baselines by 38 percentage points, while maintaining a 90% clean rate across three legitimate control archetypes. Our results demonstrate that structural trajectory analysis captures failure modes invisible to output-only evaluation, establishing trajectory diagnostics as a necessary complement to existing LLM agent assessment methods. The benchmark artifacts and evaluation datasets are publicly available on Zenodo.
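The abstract reports three headline numbers: an illusion detection rate, a clean rate on control archetypes, and a gap over baselines in percentage points. The sketch below shows one plausible way such figures could be computed from labeled trajectories. The abstract does not define the metrics formally, so the definitions here are assumptions, and the `Verdict` record, its fields, and the sample data are hypothetical and not taken from CAUM or RS3.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """One evaluated trajectory (hypothetical record, not CAUM's format)."""
    archetype: str      # illustrative archetype label
    is_illusion: bool   # ground truth: trajectory is a competence illusion
    flagged: bool       # monitor decision: trajectory was flagged


def detection_rate(verdicts):
    """Assumed definition: share of illusion trajectories that were flagged."""
    illusions = [v for v in verdicts if v.is_illusion]
    if not illusions:
        raise ValueError("no illusion trajectories to score")
    return sum(v.flagged for v in illusions) / len(illusions)


def clean_rate(verdicts):
    """Assumed definition: share of control trajectories left unflagged."""
    controls = [v for v in verdicts if not v.is_illusion]
    if not controls:
        raise ValueError("no control trajectories to score")
    return sum(not v.flagged for v in controls) / len(controls)


def percentage_point_gap(rate_a, rate_b):
    """Absolute difference between two rates, expressed in percentage points."""
    return (rate_a - rate_b) * 100


# Made-up sample: four illusion trajectories and two control trajectories.
sample = [
    Verdict("sycophancy", True, True),
    Verdict("specification_gaming", True, True),
    Verdict("temporal_degradation", True, False),
    Verdict("false_confidence", True, True),
    Verdict("control_a", False, False),
    Verdict("control_b", False, True),
]

print(f"detection rate: {detection_rate(sample):.2f}")
print(f"clean rate: {clean_rate(sample):.2f}")
```

Note that a gap stated in percentage points is an absolute difference between two rates, not a relative improvement, which is why `percentage_point_gap` subtracts rather than divides.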

Authors

Andres Ricardo Silva Gasca

Institutions

Caelum Research Corporation (United States)