What question did this study set out to answer?

This research aims to address the reliability gap in language model agents by developing an internal forward model to predict outcomes of actions before execution.

May 27, 2026 | Open Access

Outcome-Aware Agents: From Token Prediction to Action Consequence Modeling

Key Points

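Proposed the Action-Conditioned Latent Structural Causal Model (AC-LSCM), which relates a small set of latent factors through a learned sparse directed acyclic graph (DAG) and implements actions as do-operator interventions rather than as context concatenation.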
Evaluated AC-LSCM on synthetic structural causal models with a focus on safety-critical agent planning tasks.
Conducted ablation studies to identify effective architectural components.
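AC-LSCM reduces safety violations by roughly 36x relative to a Transformer baseline (mean 0.005 vs. 0.180 across 13 seeds).
Twelve of the 13 seeds produce zero safety violations, and an attribution control indicates the result is driven by the architectural design rather than by data volume.
Ablations show that the do-operator and the abduction loop carry the result, while the NOTEARS DAG constraint and a contrastive hinge term are net-negative; the authors recommend removing both in a simplified follow-up architecture.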
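
To make the forward-model idea concrete, here is a minimal Python sketch of an agent that predicts the outcome of each candidate action and only returns one whose predicted outcome passes a safety check. Everything in it is an illustrative assumption: `predict_outcome`, `is_safe`, and `choose_action` are hypothetical names, and the forward model is a hand-written toy rule set, not the learned AC-LSCM described in the study.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]


def predict_outcome(state: State, action: str) -> State:
    """Hypothetical toy forward model: returns the predicted next state
    without touching the real system. Rules are hand-written for illustration."""
    predicted = dict(state)  # copy, so the real state is never mutated
    if action == "DROP TABLE users":
        predicted["tables"] -= 1
        predicted["rows"] = 0
    elif action == "INSERT INTO users":
        predicted["rows"] += 1
    return predicted


def is_safe(state: State) -> bool:
    """Assumed safety predicate: at least one table must survive."""
    return state["tables"] >= 1


def choose_action(
    state: State,
    candidates: List[str],
    forward_model: Callable[[State, str], State],
    safe: Callable[[State], bool],
) -> Optional[str]:
    """Return the first candidate whose predicted outcome is safe, else None."""
    for action in candidates:
        if safe(forward_model(state, action)):
            return action
    return None


state = {"tables": 1, "rows": 10}
picked = choose_action(
    state, ["DROP TABLE users", "INSERT INTO users"], predict_outcome, is_safe
)
print(picked)  # INSERT INTO users
```

The point of the sketch is the ordering: the prediction happens before execution, so an unsafe candidate such as `DROP TABLE users` is rejected without ever running against the real state.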

Abstract

Large language models are increasingly deployed as agents. They write code that executes, run database migrations, navigate browsers, and operate on production systems. Their training objective, next-token prediction, contains no signal about what their actions actually do. A code agent that proposes DROP TABLE has not modeled the resulting database state. It has predicted a plausible token sequence. We argue this is the central reliability gap in current agent systems, and that closing it requires giving agents an internal forward model: given a candidate action, predict the outcome before executing. We propose the Action-Conditioned Latent Structural Causal Model (AC-LSCM) as one such mechanism. AC-LSCM maintains a small set of latent factors related by a learned sparse directed acyclic graph (DAG), and implements actions as structural interventions in the sense of Pearl's do-operator rather than as context concatenation. We evaluate AC-LSCM on synthetic structural causal models and report two findings. First, on a safety-critical agent planning task, AC-LSCM reduces safety violations by roughly 36x relative to a Transformer baseline (mean 0.005 versus 0.180 across 13 seeds). Twelve of those 13 seeds produce zero safety violations. An attribution control confirms that the result is driven by the architectural design, not by data volume. Second, the architecture as originally specified is over-engineered. Ablations show the do-operator and the abduction loop carry the result, while the NOTEARS DAG constraint and a contrastive hinge term are net-negative interventions that we recommend removing. We report the negative results in full and propose a simplified follow-up architecture. Training is unstable at the scales tested: roughly a third of seeds fail to produce a usable planner despite normal training-time MSE. Even on those failing seeds, safety violation rates remain below the Transformer baseline. 
Code, configs, and per-seed result JSONs accompany the preprint. All experiments ran on a single NVIDIA Tesla T4 in fp32.
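
The difference between intervening on a causal graph and merely conditioning on extra context can be illustrated with a small sketch. The Python below is not the authors' implementation: it assumes a three-node linear structural causal model with hand-set weights and a fixed topological order, whereas the abstract describes learned latent factors and a learned graph. The function names (`forward`, `abduct`, `counterfactual`) are hypothetical, and mapping the paper's "abduction loop" onto the abduction step of Pearl's abduction, action, prediction procedure is an assumption.

```python
from typing import Dict, List, Optional

Matrix = List[List[float]]


def forward(W: Matrix, u: List[float],
            do: Optional[Dict[int, float]] = None) -> List[float]:
    """Evaluate z_i = sum_j W[j][i] * z_j + u_i in topological order.

    Assumes nodes are already topologically sorted, so only entries
    W[j][i] with j < i are used (a DAG by construction).
    """
    do = do or {}
    z: List[float] = []
    for i in range(len(u)):
        if i in do:
            # do(z_i = v): cut incoming edges and noise, clamp the value.
            z.append(do[i])
        else:
            z.append(sum(W[j][i] * z[j] for j in range(i)) + u[i])
    return z


def abduct(W: Matrix, z: List[float]) -> List[float]:
    """Abduction: recover the noise terms consistent with an observed z."""
    return [z[i] - sum(W[j][i] * z[j] for j in range(i))
            for i in range(len(z))]


def counterfactual(W: Matrix, z_obs: List[float],
                   do: Dict[int, float]) -> List[float]:
    """Abduction, then action (do), then prediction."""
    return forward(W, abduct(W, z_obs), do)


# Chain z0 -> z1 -> z2 with illustrative weights.
W = [[0.0, 2.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 0.0, 0.0]]
z_obs = forward(W, [1.0, 0.5, 0.25])
z_cf = counterfactual(W, z_obs, {1: 0.0})
print(z_obs)  # [1.0, 2.5, -2.25]
print(z_cf)   # [1.0, 0.0, 0.25]
```

In the counterfactual, the upstream factor `z0` keeps its observed value while the downstream factor `z2` changes, because the intervention on `z1` severs its dependence on `z0` but still propagates forward. Appending the action to the model's input context carries no such guarantee.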

Cite This Study

Mallesh Madapathi studied this question.

synapsesocial.com/papers/6a168ab40c924ddd1bd596ed https://doi.org/10.5281/zenodo.20379455
