This public evidence package documents three internal Lumenais governed-context benchmarks: governed memory pressure, automatic memory promotion, and tool-context compression derived from the Berkeley Function-Calling Leaderboard (BFCL). The package includes public-safe methods reports, aggregate figures, summary metrics, source-artifact hashes, checksums, and metadata. It is designed to make the reported case-study claims auditable without exposing raw row-level prompts, model responses, private logs, API keys, or proprietary routing details. The reports evaluate bounded capabilities of a governed continual-learning layer around frontier language models: preserving approved current memory under stale-context pressure, inferring current memory state before answer construction, and reducing visible tool context while preserving function-call quality. These materials document internal adversarial diagnostics and BFCL-derived stress tests. They do not claim external validation, official BFCL leaderboard standing, universal memory safety, broad reasoning superiority, model-weight learning, or customer-production validation.
Aaron Martinez (Tue,) studied this question.
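Because the package relies on hashes and checksums for auditability, a reader may want to confirm that downloaded files match the published digests. The following is a minimal sketch, not the package's documented procedure. It assumes the checksums are SHA-256 digests stored in a sha256sum-style manifest with one `<hex digest>  <relative path>` entry per line; the manifest layout, the hash algorithm, and the function names `sha256_of` and `verify_manifest` are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path, root: Path) -> dict[str, bool]:
    """Check every '<hex digest>  <relative path>' line of a manifest.

    ASSUMPTION: the manifest uses a sha256sum-style layout. The real
    package may use a different algorithm or file format.
    """
    results: dict[str, bool] = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # sha256sum marks binary mode with '*'
        target = root / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify_manifest(Path("SHA256SUMS"), Path("."))` (the manifest filename is again an assumption) returns a mapping from each listed file to `True` or `False`, so any mismatched or missing artifact is visible at a glance.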