What question did this study set out to answer?

This package documents and evaluates three internal Lumenais governed-context benchmarks, which test memory handling and tool-context compression in a governed layer around frontier language models.

May 28, 2026 · Open Access

Lumenais Governed Context Benchmarks: Public Evidence Package, May 2026

Key Points

Covers three internal benchmarks: governed memory pressure, automatic memory promotion, and BFCL-derived tool-context compression.
Includes public-safe methods reports, aggregate figures, summary metrics, source-artifact hashes, checksums, and metadata.
Evaluates bounded capabilities of a governed continual-learning layer around frontier language models.
Documents internal adversarial diagnostics and BFCL-derived stress tests without exposing raw row-level prompts, model responses, private logs, API keys, or proprietary routing details.
Evaluates preservation of approved current memory under stale-context pressure.
Evaluates inference of the current memory state before answer construction.
Evaluates reduction of visible tool context while preserving function-call quality.

Abstract

This public evidence package documents three internal Lumenais governed-context benchmarks: governed memory pressure, automatic memory promotion, and BFCL-derived tool-context compression. The package includes public-safe methods reports, aggregate figures, summary metrics, source-artifact hashes, checksums, and metadata. It is designed to make the reported case-study claims auditable without exposing raw row-level prompts, model responses, private logs, API keys, or proprietary routing details. The reports evaluate bounded capabilities of a governed continual-learning layer around frontier language models: preserving approved current memory under stale-context pressure, inferring current memory state before answer construction, and reducing visible tool context while preserving function-call quality. These materials document internal adversarial diagnostics and BFCL-derived stress tests. They do not claim external validation, official BFCL leaderboard standing, universal memory safety, broad reasoning superiority, model-weight learning, or customer-production validation.
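Because the package ships checksums alongside its artifacts, a reader could confirm that downloaded files match the published digests before relying on them. The following is a minimal sketch of such a check, not the package's documented procedure. It assumes SHA-256 digests and a sha256sum-style manifest (one "<hex digest>  <relative path>" entry per line); the function names and any file names used with it are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_text: str, root: Path) -> dict[str, bool]:
    """Check each '<hex digest>  <relative path>' line against files under root.

    Assumption: the manifest follows the sha256sum layout. The actual
    package may use a different algorithm or file format.
    """
    results = {}
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        expected, _, name = line.partition(" ")
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = root / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify_manifest(manifest_text, Path("package_dir"))` returns a mapping from each listed file to whether its digest matched, so missing or altered artifacts appear as `False` instead of stopping the run.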
