What question did this study set out to answer?

This study presents CGPA, an architecture that provides persistent operational memory for task-oriented conversational AI without cloud retraining.

June 15, 2026 · Open Access

Context‑Gated Pattern Accumulation (CGPA): A Design Artefact for Persistent, Privacy‑Preserving AI Memory in Task‑Oriented Conversational Systems

Key Points

CGPA is an architecture that provides persistent operational memory for task-oriented conversational AI without cloud retraining.
It is designed as a lightweight, edge-resident architecture for bounded task-oriented domains.
Contexts are defined declaratively in a relational database, and resolved patterns are cached on-device.
Cache performance and latency were observed in deployment; the reported convergence data come from the MAOI FnB vertical.
The observed cache hit rate was 15–45% at 0–50 transactions, 70–90% at 50–100, and over 95% beyond 100 transactions per context.
Tier 1 resolution latency was under 20 ms, and IoT command latency was under 150 ms.
The total on-device footprint is approximately 18 MB disk and 1.2 MB RAM.

Abstract

Task-oriented conversational AI deployments in edge environments face a combination of constraints that existing architectures address only partially: persistent operational memory without cloud retraining, sub-second response latency on commodity hardware, structural privacy enforcement, and direct physical-world actuation. This paper presents Context-Gated Pattern Accumulation (CGPA) as a Design Science Research artefact, demonstrated across one production trial (MAOI), two active development deployments (BANOI, Virtual Carer V2), one planned deployment (Botler), and a 2018 architectural precursor that confirms the foundational design principle. CGPA is a lightweight, edge-resident conversational orchestration architecture for bounded task-oriented domains. Its contexts are defined declaratively in a relational database. Through Machine Teaches Machine (MTM), it accumulates successful intent resolutions, initially resolved by a cloud LLM, into a persistent on-device pattern cache, and it maps confirmed intents directly to physical-world actuator commands via IoT device registries. The architecture is specified through six formal definitions and five testable propositions grounded in Hevner et al.'s (2004) Design Science Research framework. Empirically observed convergence data from the MAOI FnB vertical show a cache hit rate of 15–45% at 0–50 transactions, 70–90% at 50–100, and 95%+ with under 5% LLM dependency beyond 100 transactions per context. Tier 1 resolution latency is under 20 ms, and IoT command latency is under 150 ms. The total on-device footprint is approximately 18 MB disk and 1.2 MB RAM.
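
The resolution flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table names, column names, function names, and the "cache"/"llm" labels are hypothetical, SQLite stands in for the unspecified relational database, exact matching on a normalized utterance stands in for whatever pattern matching CGPA actually uses, and any non-empty LLM answer is treated as a successful resolution. The cloud LLM is passed in as a plain callable so the sketch runs offline.

```python
import sqlite3
from typing import Callable, Optional, Tuple

# Hypothetical schema: names are illustrative and not taken from the paper.
SCHEMA = """
CREATE TABLE contexts (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE patterns (
    context_id INTEGER NOT NULL REFERENCES contexts(id),
    utterance  TEXT NOT NULL,
    intent     TEXT NOT NULL,
    PRIMARY KEY (context_id, utterance)
);
CREATE TABLE device_registry (
    context_id INTEGER NOT NULL REFERENCES contexts(id),
    intent     TEXT NOT NULL,
    command    TEXT NOT NULL,
    PRIMARY KEY (context_id, intent)
);
"""

# An LLM stand-in: takes (context name, utterance), returns an intent or None.
LlmResolver = Callable[[str, str], Optional[str]]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the on-device store (in memory by default) with the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def normalize(utterance: str) -> str:
    """Assumed matching rule: lowercase and collapse whitespace."""
    return " ".join(utterance.lower().split())


def context_id(conn: sqlite3.Connection, context: str) -> int:
    """Look up a declared context; undeclared contexts are rejected (the gate)."""
    row = conn.execute(
        "SELECT id FROM contexts WHERE name = ?", (context,)
    ).fetchone()
    if row is None:
        raise KeyError(f"undeclared context: {context}")
    return row[0]


def resolve(
    conn: sqlite3.Connection, context: str, utterance: str, llm: LlmResolver
) -> Tuple[Optional[str], str]:
    """Return (intent, source), where source is "cache" or "llm"."""
    ctx = context_id(conn, context)
    key = normalize(utterance)

    # Local path: look the utterance up in the pattern cache for this context only.
    hit = conn.execute(
        "SELECT intent FROM patterns WHERE context_id = ? AND utterance = ?",
        (ctx, key),
    ).fetchone()
    if hit is not None:
        return hit[0], "cache"

    # Fallback path: ask the LLM, then store the answer so the next
    # occurrence of the same utterance is served locally.
    intent = llm(context, utterance)
    if intent is not None:
        conn.execute(
            "INSERT OR REPLACE INTO patterns (context_id, utterance, intent) "
            "VALUES (?, ?, ?)",
            (ctx, key, intent),
        )
        conn.commit()
    return intent, "llm"


def command_for(
    conn: sqlite3.Connection, context: str, intent: str
) -> Optional[str]:
    """Map a confirmed intent to an actuator command, or None if unregistered."""
    row = conn.execute(
        "SELECT command FROM device_registry WHERE context_id = ? AND intent = ?",
        (context_id(conn, context), intent),
    ).fetchone()
    return row[0] if row is not None else None
```

Under these assumptions, the first occurrence of an utterance in a context goes to the LLM and every later occurrence is answered from the local table, which is the mechanism behind a hit rate that rises with the number of transactions per context. Keying the cache on the context as well as the utterance keeps patterns learned in one context from leaking into another.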
