Tool-augmented LLM sessions accumulate retrieved content that does more than waste context capacity: it actively degrades the model's ability to attend to relevant material. The "lost in the middle" phenomenon (Liu et al., 2024) and context rot research (Hong et al., 2025) demonstrate that model performance worsens as input length increases, even before the window fills. Current platforms retain everything until overflow, then discard silently. This concept note proposes a tiered retention architecture with three modes (full, summary, ephemeral), four recoverability classes, user-settable priority, and a retention note schema that makes every compression decision inspectable by the user. The security analysis identifies semantic cache poisoning as a specific risk when summarisation is performed on untrusted content, drawing on cross-model empirical findings from The Confidence Curriculum series (Phan, 2026). The proposal's core value does not depend on summarisation being secure: even without summarisation, the metadata layer provides inspectability that does not exist today. The paper specifies the retention contract (invariants, schema, recoverability constraints) but does not prescribe a single default policy. Calibrating defaults by content type, task type, domain, and user profile is a platform-level product decision and a natural site of competitive differentiation. The upload contains two files with identical content: a paginated PDF (archival format) and a continuous-scroll HTML file with a floating table of contents (reading format).
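As an illustration only, the retention contract described above might be sketched as follows. The three modes come from the abstract; every field name, the `apply_retention` helper, and the truncating `summarize` stand-in are assumptions made for this sketch, not the paper's schema. The abstract does not name the four recoverability classes, so they are kept as an opaque label here.

```python
from dataclasses import dataclass, asdict
from enum import Enum
from typing import Callable, Optional, Tuple


class RetentionMode(Enum):
    # The three modes named in the abstract.
    FULL = "full"
    SUMMARY = "summary"
    EPHEMERAL = "ephemeral"


@dataclass
class RetentionNote:
    # Hypothetical fields: the abstract names a retention note schema
    # but does not list its contents.
    item_id: str
    mode: RetentionMode
    recoverability_class: str  # opaque label; the four classes are not named in the abstract
    user_priority: int
    original_chars: int
    retained_chars: int
    reason: str
    source_ref: Optional[str] = None  # where the original could be re-fetched, if anywhere


def apply_retention(
    item_id: str,
    content: str,
    mode: RetentionMode,
    recoverability_class: str,
    user_priority: int = 0,
    source_ref: Optional[str] = None,
    summarize: Callable[[str], str] = lambda text: text[:80],  # stand-in, not real summarisation
) -> Tuple[str, RetentionNote]:
    """Return the retained text plus a note recording what was done to it."""
    if mode is RetentionMode.FULL:
        retained, reason = content, "kept verbatim"
    elif mode is RetentionMode.SUMMARY:
        retained, reason = summarize(content), "replaced by summary"
    else:
        retained, reason = "", "dropped after use"
    note = RetentionNote(
        item_id=item_id,
        mode=mode,
        recoverability_class=recoverability_class,
        user_priority=user_priority,
        original_chars=len(content),
        retained_chars=len(retained),
        reason=reason,
        source_ref=source_ref,
    )
    return retained, note


def note_to_dict(note: RetentionNote) -> dict:
    # Serialisable view of a note, so a user interface could display it.
    d = asdict(note)
    d["mode"] = note.mode.value
    return d
```

The point of the sketch is the invariant rather than the policy: every retention decision, including a silent drop, yields a note the user can inspect, regardless of which mode a platform chooses by default.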
Ivan "HiP" Phan
Ivan "HiP" Phan (Wed,) studied this question.
www.synapsesocial.com/papers/69c620ab15a0a509bde19361 — DOI: https://doi.org/10.5281/zenodo.19212126