This technical note reports a small pilot benchmark examining decision reconstruction from autonomous-vehicle (AV) disengagement records. The experiment compares two representations of the same event records:

1. Baseline narrative records
2. Structured Decision-Event Records (DER)

A sample of 90 cases was drawn from a larger AV disengagement corpus. Large language models were asked to reconstruct five decision stages:

D1 Trigger
D2 Evidence
D3 Action
D4 Recovery
D5 Root cause

Exact-agreement accuracy was measured against manually constructed gold labels.

Pilot result:

Baseline narrative reconstruction accuracy: 0.9517
DER structured reconstruction accuracy: 0.9426

The difference is small (0.0091 in favour of the baseline narrative) but suggests that over-structured templates may dilute linguistic cues required for decision reconstruction. This record is released as a pilot empirical observation and does not claim generalizable performance results. The purpose of this release is to document the benchmark design and provide a minimal empirical reference point for future work on decision transparency interfaces.

Within the broader research programme, DER should be understood as a methodological instrumentation line derived from the TPT / TBT transparency architecture. This record is a scoped empirical note and does not restate the full TPT / TBT theory layer. Further experiments will explore DER variants designed to preserve narrative cues while exposing decision structure.

Indexed in the project Root Index (https://doi.org/10.5281/zenodo.17992916).

Author: Hon Bor So
Note: The author also publishes under the name Sing So (Chöndrel Dorje).
ORCID: 0009-0008-2768-7494
Hon Bor So studied this question.
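The exact-agreement measure against gold labels mentioned in this note can be illustrated with a minimal sketch. The note does not specify how agreement was pooled or how labels were normalized, so the following are assumptions made only for illustration: each of the five stages (D1 to D5) is scored separately, labels are compared after lowercasing and whitespace collapsing, and accuracy is the fraction of matching stage labels pooled over all cases. The function names and the toy records are hypothetical and are not taken from the benchmark data.

```python
from typing import Dict, List

# Stage identifiers as listed in the note.
STAGES = ["D1", "D2", "D3", "D4", "D5"]


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(label.lower().split())


def exact_agreement(predicted: List[Dict[str, str]],
                    gold: List[Dict[str, str]]) -> float:
    """Fraction of stage labels that match the gold label exactly,
    pooled over all cases and all five stages (assumed pooling)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same cases")
    matches = 0
    total = 0
    for pred_case, gold_case in zip(predicted, gold):
        for stage in STAGES:
            total += 1
            # A missing prediction counts as a mismatch.
            if normalize(pred_case.get(stage, "")) == normalize(gold_case[stage]):
                matches += 1
    return matches / total if total else 0.0


# Toy, hypothetical records: two cases, ten stage labels in total.
gold_labels = [
    {"D1": "sensor fault", "D2": "lidar dropout", "D3": "manual takeover",
     "D4": "resumed autonomy", "D5": "hardware"},
    {"D1": "unexpected pedestrian", "D2": "camera detection", "D3": "braking",
     "D4": "stopped", "D5": "perception"},
]
model_labels = [
    {"D1": "Sensor fault", "D2": "lidar dropout", "D3": "manual takeover",
     "D4": "resumed autonomy", "D5": "software"},  # D5 disagrees with gold
    {"D1": "unexpected pedestrian", "D2": "camera detection", "D3": "braking",
     "D4": "stopped", "D5": "perception"},
]

score = exact_agreement(model_labels, gold_labels)
print(f"exact-agreement accuracy: {score:.4f}")
```

In this toy example nine of the ten stage labels agree. A stricter variant would count a case as correct only when all five stages match; the note does not say which definition was used.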