This technical note reports a small pilot benchmark examining decision reconstruction from autonomous-vehicle disengagement records.

The experiment compares two representations of the same event records:

1. Baseline narrative records
2. Structured Decision Event Records (DER)

A sample of 90 cases was drawn from a larger AV disengagement corpus. Large language models were asked to reconstruct five decision stages:

D1 Trigger
D2 Evidence
D3 Action
D4 Recovery
D5 Root cause

Exact-agreement accuracy was measured against manually constructed gold labels.

Pilot result:

Baseline narrative reconstruction accuracy: 0.9517
DER structured reconstruction accuracy: 0.9426

The difference is small but indicates that over-structured templates may dilute linguistic cues required for decision reconstruction.

This record is released as a pilot empirical observation and does not claim generalizable performance results. The purpose of this release is to document the benchmark design and provide a minimal empirical reference point for future work on decision transparency interfaces.

Within the broader research programme, DER (Decision-Event Record) should be understood as a methodological instrumentation line derived from the TPT / TBT transparency architecture. This record is a scoped empirical note and does not restate the full TPT / TBT theory layer.

Further experiments will explore DER variants designed to preserve narrative cues while exposing decision structure.

Indexed in the project Root Index (https://doi.org/10.5281/zenodo.17992916).

Author: Hon Bor So
Note: The author also publishes under the name Sing So (Chöndrel Dorje).
ORCID: 0009-0008-2768-7494
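
The note does not spell out how exact-agreement accuracy is aggregated, so the following is only a minimal sketch under stated assumptions: each case carries one label per stage (D1 to D5), a prediction counts as correct only when it matches the gold label exactly after trimming whitespace and lowercasing, and accuracy is the fraction of matching (case, stage) pairs. The function name, the normalization step, and the toy labels are hypothetical and are not taken from the benchmark.

```python
from typing import Dict, List

# The five decision stages named in the note.
STAGES = ["D1", "D2", "D3", "D4", "D5"]


def exact_agreement_accuracy(
    predictions: List[Dict[str, str]],
    gold: List[Dict[str, str]],
) -> float:
    """Fraction of (case, stage) pairs whose predicted label equals the gold label.

    Assumption: labels are compared after strip() and lower(); the note
    does not say whether any normalization was applied.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must contain the same number of cases")

    matches = 0
    total = 0
    for pred_case, gold_case in zip(predictions, gold):
        for stage in STAGES:
            total += 1
            predicted = pred_case.get(stage, "").strip().lower()
            expected = gold_case[stage].strip().lower()
            if predicted == expected:
                matches += 1
    return matches / total if total else 0.0


# Toy data for illustration only; these are NOT benchmark cases or results.
toy_gold = [
    {"D1": "sensor fault", "D2": "lidar dropout", "D3": "manual takeover",
     "D4": "resumed", "D5": "hardware"},
    {"D1": "cut-in", "D2": "camera track", "D3": "brake",
     "D4": "resumed", "D5": "planner"},
]
toy_pred = [
    {"D1": "Sensor fault", "D2": "lidar dropout", "D3": "manual takeover",
     "D4": "resumed", "D5": "hardware"},
    {"D1": "cut-in", "D2": "camera track", "D3": "brake",
     "D4": "resumed", "D5": "perception"},  # one mismatch on D5
]

toy_accuracy = exact_agreement_accuracy(toy_pred, toy_gold)
print(f"Toy exact-agreement accuracy: {toy_accuracy:.4f}")
```

Under this reading, the same function would be run once on reconstructions from the baseline narrative records and once on reconstructions from the DER records, and the two scores compared. A stricter variant, in which a case counts only if all five stages match, would give lower numbers and is an equally plausible interpretation.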
Hon Bor So
Oldham Council
Hon Bor So studied this question.
www.synapsesocial.com/papers/69ada8c2bc08abd80d5bc0a0 — DOI: https://doi.org/10.5281/zenodo.18905162