Task-oriented conversational AI deployed in edge environments faces a combination of constraints that existing architectures address only partially: persistent operational memory without cloud retraining, sub-second response latency on commodity hardware, structural privacy enforcement, and direct physical-world actuation. This paper presents Context-Gated Pattern Accumulation (CGPA) as a Design Science Research artefact, demonstrated across one production trial (MAOI), two active development deployments (BANOI, Virtual Carer V2), one planned deployment (Botler), and a 2018 architectural precursor that confirms the foundational design principle. CGPA is a lightweight, edge-resident conversational orchestration architecture for bounded task-oriented domains. Its contexts are defined declaratively in a relational database. Through Machine Teaches Machine (MTM), it accumulates successful intent resolutions, initially produced by a cloud LLM, into a persistent on-device pattern cache, and it maps confirmed intents directly to physical-world actuator commands via IoT device registries. The architecture is specified through six formal definitions and five testable propositions grounded in the Design Science Research framework of Hevner et al. (2004). Convergence data observed empirically in the MAOI FnB vertical show a cache hit rate of 15–45% at 0–50 transactions, 70–90% at 50–100, and 95% or higher (with under 5% LLM dependency) beyond 100 transactions per context. Tier 1 resolution latency is under 20 ms, IoT command latency is under 150 ms, and the total on-device footprint is approximately 18 MB of disk and 1.2 MB of RAM.
TRUONG VIET PHAN studied this question.
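The accumulation loop described in the abstract can be illustrated with a minimal sketch. This is not the CGPA implementation: every name here (`PatternCache`, `Resolver`, `normalize`, the `patterns` table, the registry layout) is hypothetical, the cloud LLM is replaced by an arbitrary callable, and "successful resolution" is approximated as "the returned intent exists in the device registry". It only shows the general idea of checking an on-device cache first, falling back to the LLM on a miss, and storing the result so that later requests in the same context no longer need the cloud.

```python
import sqlite3
from typing import Callable, Dict, Optional


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so trivially different phrasings share a key."""
    return " ".join(utterance.lower().split())


class PatternCache:
    """Hypothetical on-device cache of (context, utterance) -> intent, backed by SQLite."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS patterns ("
            "context TEXT, utterance TEXT, intent TEXT, "
            "PRIMARY KEY (context, utterance))"
        )

    def lookup(self, context: str, utterance: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT intent FROM patterns WHERE context = ? AND utterance = ?",
            (context, normalize(utterance)),
        ).fetchone()
        return row[0] if row else None

    def store(self, context: str, utterance: str, intent: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO patterns VALUES (?, ?, ?)",
            (context, normalize(utterance), intent),
        )
        self.db.commit()


class Resolver:
    """Cache-first intent resolution with LLM fallback (illustrative only)."""

    def __init__(
        self,
        cache: PatternCache,
        llm: Callable[[str, str], str],  # assumed signature: (context, utterance) -> intent
        registry: Dict[str, str],        # assumed layout: intent -> actuator command
    ) -> None:
        self.cache = cache
        self.llm = llm
        self.registry = registry
        self.cache_hits = 0
        self.llm_calls = 0

    def resolve(self, context: str, utterance: str) -> Optional[str]:
        """Return the actuator command for an utterance, or None if unresolved."""
        intent = self.cache.lookup(context, utterance)
        if intent is not None:
            self.cache_hits += 1
        else:
            self.llm_calls += 1
            intent = self.llm(context, utterance)
            # Assumption: only intents that map to a registered device are
            # treated as successful and accumulated into the cache.
            if intent in self.registry:
                self.cache.store(context, utterance, intent)
        return self.registry.get(intent)
```

A short usage example, with a stand-in function in place of the cloud LLM and a made-up command string:

```python
def fake_llm(context: str, utterance: str) -> str:
    return "lights_on" if "light" in utterance.lower() else "unknown"

resolver = Resolver(PatternCache(), fake_llm, {"lights_on": "relay/1/on"})
resolver.resolve("room", "Turn on the light")   # cache miss: resolved by the stand-in LLM
resolver.resolve("room", "turn on  the LIGHT")  # cache hit: no LLM call
```

Keying the cache on the context as well as the utterance is one simple way to model context gating: the same phrase stored under one context does not resolve under another.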