What question did this study set out to answer?

This research aims to address the resource overhead and technological sovereignty issues in deploying large language models in enterprise environments.

June 28, 2026 · Open Access

Delentia OS: The Intent-Centric AI Operating System Architecture for Local Edge VRAM Optimization

Key Points

Set out to address the resource overhead and technological sovereignty issues of deploying large language models in enterprise environments.
Developed the Delentia OS architecture to run entirely on local edge consumer hardware.
Implemented a dynamic Low-Rank Adaptation (LoRA) swapping scheduler and a differential memory retention method.
Conducted empirical verification with property-based testing across 205,999 regression examples.
Achieved sub-12ms dynamic LoRA swapping, averaging 11.2ms for adapter hot-swaps.
Reduced the context-retrieval VRAM footprint by 74.2% and lowered repeated-query inference costs by up to 99.4%.
Attained a 0.00% syntax error rate in structured JSON generation during automated workflows.
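
To make the adapter-swapping idea in the key points concrete, here is a minimal sketch in plain Python. It relies only on the standard LoRA formulation, in which a frozen weight W is augmented by a low-rank product scaled by a constant. The class and method names (LoraAdapter, AdapterScheduler, swap) and the toy matrices are assumptions for illustration and are not taken from the paper. The point it shows is that a swap only changes which small adapter is active, while the base weights stay untouched.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, adequate for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


@dataclass(frozen=True)
class LoraAdapter:
    # Hypothetical container; the field names are illustrative only.
    name: str
    down: Matrix   # r x d_in projection (often written A)
    up: Matrix     # d_out x r projection (often written B)
    scale: float   # alpha / r in the usual LoRA notation


class AdapterScheduler:
    """Keeps one frozen base weight and switches the active adapter by reference."""

    def __init__(self, base_weight: Matrix, adapters: List[LoraAdapter]) -> None:
        self.base_weight = base_weight  # never modified after construction
        self.adapters: Dict[str, LoraAdapter] = {a.name: a for a in adapters}
        self.active: Optional[LoraAdapter] = None

    def swap(self, name: str) -> None:
        # Only a reference changes; raises KeyError for an unknown adapter.
        self.active = self.adapters[name]

    def forward(self, x: Matrix) -> Matrix:
        # y = W x + scale * up (down x), with x given as a d_in x 1 column.
        y = matmul(self.base_weight, x)
        if self.active is None:
            return y
        delta = matmul(self.active.up, matmul(self.active.down, x))
        s = self.active.scale
        return [[yi[0] + s * di[0]] for yi, di in zip(y, delta)]


# Toy setup: 2x2 identity base weight and two rank-1 adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
router = LoraAdapter("router", down=[[1.0, 0.0]], up=[[0.0], [1.0]], scale=2.0)
scribe = LoraAdapter("scribe", down=[[0.0, 1.0]], up=[[1.0], [0.0]], scale=0.5)
scheduler = AdapterScheduler(base, [router, scribe])
x = [[3.0], [4.0]]
```

Calling scheduler.swap("router") and then scheduler.forward(x) applies the router adapter on top of the unchanged base weight. A real system would hold the adapter tensors in VRAM and time the swap, which this sketch does not attempt.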

Abstract

The deployment of Large Language Models (LLMs) in enterprise environments is currently bottlenecked by extreme resource overhead, variable API pricing, and a lack of technological sovereignty. This paper presents Delentia OS, an intent-centric AI operating system architecture designed to run entirely on local edge consumer hardware. Driven by the JITNA (RFC-001) protocol and an underlying mathematical governance framework (F = DI * A), Delentia OS utilizes a frozen 8-billion parameter base model coupled with a dynamic Low-Rank Adaptation (LoRA) swapping scheduler.

Key Architectural Highlights:
• Sub-12ms Dynamic LoRA Swapping: Hot-swaps four specialized cognitive adapters (Router, Guardian, Executor, Scribe) within local VRAM in under 12 milliseconds (actual avg. 11.2 ms).
• Differential Memory Retention (Delta Engine): Implements ALGO-41 to compress long-term semantic memory, reducing context retrieval VRAM footprint by 74.2% and lowering repeated query inference costs by up to 99.4%.
• Deterministic Mathematical Governance: Regulates autonomous agentic workflows via the FDIA authorization gate check (F = DI * A), guaranteeing absolute transaction-level safety.

Empirical Verification & Benchmarks:
• Zero-Crash Invariant Verification: Evaluated via property-based testing across 205,999 regression examples with zero software crashes (0.00% crash rate).
• Absolute Execution Precision: Achieves a 0.00% syntax error rate in structured JSON generation and automated tool-calling workflows.

Keywords: Intent-Centric AI, Edge Computing, Dynamic LoRA Swapping, VRAM Optimization, Cognitive Operating System, SignedAI Consensus, Technological Sovereignty, JITNA Protocol, Small Language Models (SLM).
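
The abstract gives the governance rule only as F = DI * A and does not define its terms. The sketch below is therefore a guess at how a multiplicative gate of this shape could behave, not the paper's implementation. It assumes DI and A are scores normalized to [0, 1] and that an action is authorized when F reaches a hypothetical threshold. The function names and the threshold value are invented for illustration. The one property that follows from the formula itself is that a zero in either factor forces F to zero.

```python
def fdia_score(di: float, a: float) -> float:
    """Return F = DI * A. The [0, 1] range for both inputs is an assumption."""
    if not (0.0 <= di <= 1.0 and 0.0 <= a <= 1.0):
        raise ValueError("di and a are assumed to lie in [0, 1]")
    return di * a


def authorize(di: float, a: float, threshold: float = 0.5) -> bool:
    """Hypothetical gate: allow the action only when F meets the threshold."""
    return fdia_score(di, a) >= threshold


# Because the rule is a product, a zero factor blocks the action
# no matter how large the other factor is.
allowed = authorize(0.9, 1.0)   # F = 0.9
blocked = authorize(0.9, 0.0)   # F = 0.0
```

A product, unlike a weighted sum, cannot let one strong factor compensate for a missing one, which may be why a multiplicative form is used for an authorization check.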
