March 27, 2026 | Open Access

Strategic Forgetting: Ephemeral Retrieval and Tiered Context Persistence for LLM Inference

Key Points

Addresses performance degradation in LLMs caused by accumulated context ("context fatigue") and offers a novel retention architecture.
Proposes a tiered retention architecture with three modes: full, summary, and ephemeral.
Establishes four recoverability classes and user-settable priorities for content retention.
Conducts a security analysis that identifies semantic cache poisoning as a risk when untrusted content is summarised.
Highlights the 'lost in the middle' phenomenon and context rot research, which show model performance worsening as input length increases.
Emphasizes a user-inspectable metadata layer that exposes every compression decision.
Does not prescribe a single default retention policy; calibrating defaults by content type, task type, domain, and user profile is left to the platform.

Abstract

Tool-augmented LLM sessions accumulate retrieved content that does not just waste context capacity; it actively degrades the model's ability to attend to relevant material. The "lost in the middle" phenomenon (Liu et al., 2024) and context rot research (Hong et al., 2025) demonstrate that model performance worsens as input length increases, even before the window fills. Current platforms retain everything until overflow, then discard silently. This concept note proposes a tiered retention architecture with three modes (full, summary, ephemeral), four recoverability classes, user-settable priority, and a retention note schema that makes every compression decision inspectable by the user. The security analysis identifies semantic cache poisoning as a specific risk when summarisation is performed on untrusted content, drawing on cross-model empirical findings from The Confidence Curriculum series (Phan, 2026). The proposal's core value does not depend on summarisation being secure: even without it, the metadata layer provides inspectability that does not exist today. The paper specifies the retention contract (invariants, schema, recoverability constraints) but does not prescribe a single default policy. Calibrating defaults by content type, task type, domain, and user profile is a platform-level product decision and a natural site of competitive differentiation. The upload contains two files with identical content: a paginated PDF (archival format) and a continuous-scroll HTML file with a floating table of contents (reading format).
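
To make the idea of a retention note more concrete, the following is a minimal sketch in Python, not the paper's actual schema. Only the three mode names (full, summary, ephemeral) and the count of four recoverability classes come from the abstract. The field names (`item_id`, `recoverability_class`, `user_priority`, `reason`), the integer encoding of the classes, and the `apply_retention` helper are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class RetentionMode(Enum):
    """The three retention modes named in the abstract."""
    FULL = "full"
    SUMMARY = "summary"
    EPHEMERAL = "ephemeral"


@dataclass(frozen=True)
class RetentionNote:
    """Hypothetical per-item record that keeps a retention decision inspectable.

    All field names are assumptions; the paper defines its own schema.
    """
    item_id: str
    mode: RetentionMode
    # The abstract mentions four recoverability classes without naming them
    # here, so this sketch encodes them as the integers 1 to 4.
    recoverability_class: int
    user_priority: int = 0   # assumed: a user-settable priority value
    reason: str = ""         # assumed: human-readable rationale for the decision

    def __post_init__(self) -> None:
        if not 1 <= self.recoverability_class <= 4:
            raise ValueError("recoverability_class must be between 1 and 4")


def apply_retention(
    content: str,
    note: RetentionNote,
    summarise: Callable[[str], str],
) -> Optional[str]:
    """Return what stays in context for one item, given its retention note.

    `summarise` is a caller-supplied placeholder. The abstract flags
    summarising untrusted content as a cache-poisoning risk, so a real
    system would need to gate this step.
    """
    if note.mode is RetentionMode.FULL:
        return content
    if note.mode is RetentionMode.SUMMARY:
        return summarise(content)
    # Ephemeral: the content is dropped, but the note itself survives,
    # so the user can still see that something was removed and why.
    return None
```

In this sketch the note outlives the content it describes, which is the property the abstract emphasises: even with summarisation disabled, the metadata alone records what was kept, compressed, or dropped.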
