Large language model (LLM) agents increasingly operate autonomously in production environments, yet current evaluation methods rely primarily on output-focused benchmarks that fail to capture how an agent arrives at its answer. We introduce the concept of competence illusion: agent trajectories that produce superficially correct or confident outputs while exhibiting structurally deficient behavior patterns, including sycophancy, specification gaming, temporal degradation, and false confidence projection. We present CAUM (Continuous Agent Understanding Monitor), a privacy-preserving structural observation system that analyzes agent trajectories through three complementary diagnostic layers: structural integrity scoring (UDS), substance quality assessment (SQI), and temporal stability analysis (TDS). To evaluate CAUM's detection capabilities, we introduce RS3, a behavioral stress-test suite of 20 hand-crafted archetypes spanning five categories. CAUM achieves an 88.1% illusion detection rate (95% CI: 83-93%), outperforming naive baselines by 38 percentage points, while maintaining a 90% clean rate across three legitimate control archetypes. Our results demonstrate that structural trajectory analysis captures failure modes invisible to output-only evaluation, establishing trajectory diagnostics as a necessary complement to existing LLM agent assessment methods. The benchmark artifacts and evaluation datasets are publicly available on Zenodo.
Andres Ricardo Silva Gasca
Caelum Research Corporation (United States)
www.synapsesocial.com/papers/69b2586696eeacc4fcec7f5f — DOI: https://doi.org/10.5281/zenodo.18927885